Questions

As proposed originally, we would like to answer the following questions:

  1. Examine how the number of open turnstiles affects the wait time required to enter the park and in so doing, find the optimum number of turnstiles to ensure that no agent waits for more than 10 minutes.

  2. Every guest that enters the park must choose a number of rides. In our simulation, we generate this number from a distribution. Compare the generated values to the theoretical value and confirm that our simulation produces results in agreement with the theory.

  3. Based upon the number of turnstiles required to assure that no agent waits for more than 10 minutes, show the distribution of time spent in the park.

Setup

The code below defines the location of the simulation data and loads the departure modeling code used in the analysis.

#define the file base
file_base = "~/Datasets/Disney-Sim/"

#import the departure modeling algorithm
knitr::read_chunk('../../R/modeling_departures.R')

Question 1

We begin by examining the first question.

Experiment Setup

For this experiment, we vary the number of turnstiles from 1 to 50 and determine the first turnstile count where the maximum wait time is below the 10-minute threshold. Because of stochastic variation, we perform a Monte Carlo simulation with two (2) trials per turnstile count. (Of course, this is not nearly enough simulation trials, but due to time constraints it is the best that can be achieved.) The end result is a collection of 100 simulation files, with each turnstile count (1 - 50) being simulated twice.

Once the simulation files have been generated, for each simulation file we extract the rows where an agent entered the park (event_enter_park). For each of these rows, we subtract the arrival time (a field in the row) from the time of entry to obtain the wait time. We log whether the maximum wait time is less than the threshold value and proceed to the next simulation file.
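As a minimal sketch of the per-file check described above, the following runs the same steps on a small made-up log. The column names (trig_event_type, cur_time, arrive_park) and the event codes follow the simulation output; the helper max_wait_seconds and the toy values are assumptions for illustration only.

```r
# Toy log using the column names from the simulation output (values are made up)
toy_log <- data.frame(
  agent_id        = c(1, 1, 2, 2, 3, 3),
  trig_event_type = c(1, 2, 1, 2, 1, 2),   # 1 = arrive at park, 2 = enter park
  cur_time        = c(0, 120, 30, 700, 60, 400),
  arrive_park     = c(0, 0, 30, 30, 60, 60)
)

# Hypothetical helper: longest wait (in seconds) between arriving and entering
max_wait_seconds <- function(log, enter_code = 2) {
  entered <- log[log$trig_event_type == enter_code, ]
  max(entered$cur_time - entered$arrive_park)
}

threshold <- 10 * 60                    # 10 minutes in seconds
max_wait_seconds(toy_log)               # 670 (agent 2 waited 700 - 30 seconds)
max_wait_seconds(toy_log) < threshold   # FALSE, so this toy log fails the check
```

Returning the numeric maximum, rather than only TRUE or FALSE, also makes it possible to see how far a given turnstile count is from the threshold.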

Testing the Simulation Output

As we built the simulation, we tested the components of the simulation. The idea behind doing this is that if the individual components are working properly, the whole system, if properly configured, should work properly. During this integration we noticed the following bug:

  • Table-service dining experiences have a capacity limit in our simulation. As initially configured, the selected dining location was added to the dining_visited field before confirming that capacity was available. The issue was corrected and the simulations were re-run.

Even though we tested along the way, we perform a quick sanity check. Every file should contain the same number of agents, because the arrivals were fixed in our simulation (after generating the initial arrival distribution).

Please note that, due to the size of the simulation output files produced (~4 GB), the datasets cannot be shipped with the code. Please adjust the file_base variable in the setup code (located at the top of this page) so that the simulation files can be located. The simulation output can be found on data.world.
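Before starting a long run, it can help to confirm that file_base points at the logs. The sketch below builds the 100 expected file names, following the sim_log_i_j.txt pattern used in the code on this page (i = turnstile count, j = trial), and reports how many are missing. The names grid, paths and missing_logs are our own hypothetical choices.

```r
# Same default location as in the setup code; adjust to your machine
file_base <- "~/Datasets/Disney-Sim/"

# One row per (turnstile count, trial) pair; trial varies fastest
grid <- expand.grid(trial = 1:2, turnstiles = 1:50)

# Expected log file for every pair
paths <- file.path(file_base,
                   paste0("sim_log_", grid$turnstiles, "_", grid$trial, ".txt"))

# Collect the files that cannot be found and report them
missing_logs <- paths[!file.exists(paths)]
if (length(missing_logs) > 0) {
  message(length(missing_logs), " of ", length(paths),
          " simulation logs were not found under ", file_base)
}
```

If the message appears, file_base most likely needs to be adjusted before running the checks that follow.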

library(foreach)
library(progress)

#define the number of agents to expect per simulation
num_of_agents = 44729

#instantiate a new progress bar
pb <- progress::progress_bar$new(total=100)

#we have 100 files to process (50 * 2)
comparison <- foreach(i = 1:50) %:%
  foreach(j = 1:2) %do% {
    
    #read the data in
    results <- read.csv(file.path(file_base, paste0("sim_log_", i, "_", j, ".txt")))
    
    #increment the progress bar
    #we do this before checking the number of agents so that the equality results are returned
    pb$tick()
    
    #check that the required number of agents are present
    num_of_agents == length(unique(results$agent_id))
}
#unlist the results and check that all are TRUE
comparison <- unlist(comparison)
all(comparison == TRUE)
## [1] TRUE

Results

The simulation was run in parallel on an Intel i5 3470 (3.6GHz base) with 12GB of RAM. On this hardware, the simulation took approximately 30 hours to complete.

With the simulation logs generated, we now identify the first turnstile count where the wait time is less than 10 minutes. To do this we load each simulation log, extract the rows where agents entered the park (event code: event_enter_park) and subtract each agent's arrival time from the time they entered the park. We do this for all 100 simulations and analyze the output to determine which number of turnstiles produces wait times below 10 minutes. Please note that because we performed a Monte Carlo simulation, both simulation trials must have wait times below the threshold for a turnstile count to be considered a success.

library(foreach)
library(progress)

#instantiate a progress bar
pb <- progress::progress_bar$new(total=100)

#set the threshold
threshold = 10 * 60 #10 minutes in seconds

#we have 100 files to go through
wait_time_results <- foreach(i = 1:50) %:%
  foreach(j = 1:2) %do% {
    
    #load the results
    results <- read.csv(file.path(file_base, paste0("sim_log_", i, "_", j, ".txt")))
    
    #extract only the event_enter_park (event_enter_park is defined as 2 in the event-codes)
    event2 <- results[which(results$trig_event_type == 2),]
    
    #compute the wait times by taking the current time (the time the agent entered the park)
    #and subtracting off the arrival time
    wait_times <- event2$cur_time - event2$arrive_park
    
    #update progress bar
    pb$tick()
    
    #sanity check to assure no negative wait times (a wait of zero is allowed)
    stopifnot(all(wait_times >= 0))
    
    #check whether the wait times are below the threshold
    max(wait_times) < threshold
    
  }
#unlist the results
wait_time_results <- unlist(wait_time_results)

#build the wait times into a matrix of trials
results <- matrix(wait_time_results, ncol = 2, byrow = TRUE)
colnames(results) <- c("Trial 1", "Trial 2")

#display
DT::datatable(results, rownames = paste("Turnstile Count @", 1:50))

FALSE indicates that the maximum wait time is not below the threshold, whereas TRUE indicates that it is. From this table, one can see that 28 is the first turnstile count at which both trials stay below the threshold, so having 28 open turnstiles assures that no agent waits longer than 10 minutes in line.
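The lookup can also be done programmatically rather than by reading the table. The sketch below uses a small made-up logical matrix in place of results; toy, both_pass, first_count and stable_count are hypothetical names. It reports both the first count where both trials pass and the first count from which every larger count also passes, since stochastic variation could make the two differ.

```r
# Made-up stand-in for the 50 x 2 matrix of trial outcomes (rows = turnstile counts)
toy <- matrix(c(FALSE, FALSE,
                TRUE,  TRUE,
                FALSE, TRUE,
                TRUE,  TRUE,
                TRUE,  TRUE),
              ncol = 2, byrow = TRUE,
              dimnames = list(NULL, c("Trial 1", "Trial 2")))

# A turnstile count succeeds only if both trials are below the threshold
both_pass <- apply(toy, 1, all)

# First count at which both trials pass
first_count <- which(both_pass)[1]

# First count from which this and every larger count pass
stable_count <- min(which(rev(cumprod(rev(both_pass))) == 1))

first_count    # 2
stable_count   # 4
```

In the toy data the two answers differ because row 3 fails after row 2 passes; with only two trials per count, checking both is a cheap guard against a lucky pair of runs.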

Question 2

We now examine the second question.

Experiment Setup

We begin by computing a theoretical value for the number of rides to visit. The number of rides to visit is generated by the line floor(rnorm(n, route_length_mean, route_length_std)) in the R/modeling_departures.R file. In the simulation, route_length_mean = 10 and route_length_std = 2. Mathematically, we say

\[ x = \lfloor \mathcal{N}(n, 10, 2) \rfloor \]

where \(x\) is the number of rides to visit and \(n\) is the number of random deviates to generate. This value is stochastic; however, because the deviates are drawn from a normal distribution, the mean of a large sample should settle near a stable central value. For each simulation output file produced (the same files as produced for Question 1) we identify all the arrival rows (event code: event_arrive_park) and extract the number of rides for each agent. We then perform a t-test on the number of rides with \(H_0: \mu_i = \mu_j\), \(H_A: \mu_i \neq \mu_j\) and \(\alpha = 0.05\), where \(\mu_i\) is the mean number of rides in the simulation file and \(\mu_j\) is the theoretical mean. We process each simulation file independently. The results are detailed below.
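As an aside, the mean of the floored normal can also be computed directly from the distribution, without sampling. The sketch below sums \(k \cdot P(k \le X < k + 1)\) over a wide range of integers \(k\) using pnorm; the function floor_normal_mean and its span argument are hypothetical and not part of R/modeling_departures.R. With a mean of 10 and a standard deviation of 2, the result is very close to 9.5, because flooring lowers the mean by about one half.

```r
# Hypothetical helper: mean of floor(X) for X ~ Normal(mu, sigma),
# summing k * P(k <= X < k + 1) over integers within span standard deviations
floor_normal_mean <- function(mu, sigma, span = 15) {
  k <- seq(floor(mu - span * sigma), ceiling(mu + span * sigma))
  p <- pnorm(k + 1, mu, sigma) - pnorm(k, mu, sigma)
  sum(k * p)
}

# Parameters used in the simulation: route_length_mean = 10, route_length_std = 2
exact_mean <- floor_normal_mean(10, 2)
exact_mean
```

A value computed this way does not change between runs, so it could serve as a fixed reference for the mu argument of t.test.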

Results

We begin by bringing in the route length generation function:

generate_route_length <- function(n, route_length_mean, route_length_std){
  floor(rnorm(n, route_length_mean, route_length_std))
}

Next, we process all of the simulation files and run a one-sample t-test comparing the number of rides to visit against the theoretical mean.

Note that because the theo_mean calculation is stochastic in nature, the exact value returned will vary between trials. Nevertheless, we set a seed in the code below so that the computation is reproducible.

library(foreach)
library(progress)

#set seed to maintain reproducibility
set.seed(1234)

#instantiate a progress bar
pb <- progress::progress_bar$new(total=100)

#set theoretic mean
#44729 comes from the fact that we generated 44729 agents during the simulation
theo_mean <- mean(generate_route_length(44729, 10, 2))

#set significance level
alpha = 0.05

#compute how many times we have a p-value that indicates we should reject the null
#because of the stochastic nature of the theo_mean computation, we will not always
#fail to reject the null (the desired outcome)
reject_null <- foreach(i = 1:50) %:%
  foreach(j = 1:2) %do% {
    
    #load the results
    results <- read.csv(file.path(file_base, paste0("sim_log_", i, "_", j, ".txt")))
    
    #grab the arrive at park events (which are defined to 1 in the event-codes)
    event1 <- results[which(results$trig_event_type == 1),]
    
    #compute a one-sample t.test on the generated number of rides
    t <- t.test(event1$rem_attractions, mu = theo_mean)
    
    #increase the progress bar
    pb$tick()
    
    #determine the results of the hypothesis test
    t$p.value <= alpha
  }
#unlist the results
reject_null <- unlist(reject_null)

#calculate the rejection rate (the fraction of trials where the null is rejected)
mean(reject_null)
## [1] 0.13

As we can see, 13% of the simulation trials (13 of 100) had a mean that was significantly different from the theoretical mean. We consider this percentage rather low and, given the stochastic nature of the theoretical mean computation discussed above, consider the simulation trials a success.
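To illustrate how a sampled reference mean can affect the rejection rate, the sketch below repeats the test on synthetic data only. It assumes each log's ride counts are an independent draw from floor(rnorm(n, 10, 2)), which may not hold for the real simulation files, and it uses fewer agents than the simulation to keep the run short. The names draw_routes, sampled_mean, exact_mean and rejection_rates are hypothetical.

```r
set.seed(42)        # arbitrary seed, only to make this illustration repeatable

n_agents <- 2000    # assumption: far fewer than the 44729 simulated agents
n_files  <- 100     # same number of logs as in the experiment
alpha    <- 0.05

# Same generator as generate_route_length with mean 10 and sd 2
draw_routes <- function(n) floor(rnorm(n, 10, 2))

# Reference mean estimated from one random sample
sampled_mean <- mean(draw_routes(n_agents))

# Reference mean from the distribution: sum of k * P(k <= X < k + 1)
k <- -20:40
exact_mean <- sum(k * (pnorm(k + 1, 10, 2) - pnorm(k, 10, 2)))

# One row per synthetic log, one p-value per reference mean
p_values <- t(replicate(n_files, {
  x <- draw_routes(n_agents)
  c(sampled = t.test(x, mu = sampled_mean)$p.value,
    exact   = t.test(x, mu = exact_mean)$p.value)
}))

# Fraction of synthetic logs that reject the null under each reference
rejection_rates <- colMeans(p_values <= alpha)
rejection_rates
```

Comparing the two columns gives a rough sense of how much of a rejection rate might be attributable to the sampling error in the reference mean itself, rather than to the data being tested.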

Question 3

We now consider the last question.

Experiment Setup

Using the same simulation results produced for Question 1, we will identify the simulation results which prevent a wait time from exceeding 10 minutes and minimize the number of open turnstiles (using the answer from Question 1). After doing this, we will process the data by identifying all the rows in which an agent leaves the park (event code: event_leave_park) and subtracting the time of arrival (a field of the returned rows). We will then build a histogram - a non-parametric model - detailing the results. No formal statistical test is necessary as we are simply presenting results.

As can be seen in Question 1, having 28 turnstiles open will assure, given the simulated arrivals, that no agent waits more than 10 minutes.

Results

We present the histogram below. As detailed in the experiment setup section, we begin by reading in the file, extracting the park leaving events (event_leave_park) and determining the time spent in the park by subtracting the arrival time from the leaving time.

#define the file base
file_base = "~/Datasets/Disney-Sim/"

#load the results
results <- read.csv(file.path(file_base, paste0("sim_log_", 28, "_", 1, ".txt")))
    
#extract only the event_leave_park event (defined as 30 in the event-codes)
event30<- results[which(results$trig_event_type == 30),]

#determine the time spent in the park
time_spent <- event30$cur_time - event30$arrive_park

hist(time_spent, xlab = "Time Spent in Park (seconds)", ylab="Frequency", main="Time Spent in Park")
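Since time_spent is measured in seconds, a version in hours may be easier to read. The following is a minimal sketch on made-up values; toy_time_spent stands in for time_spent, and the uniform draw between 2 and 12 hours is purely an assumption so that the example runs without the simulation logs.

```r
set.seed(1)   # arbitrary seed for a repeatable toy example

# Made-up stand-in for time_spent (in seconds)
toy_time_spent <- runif(500, min = 2 * 3600, max = 12 * 3600)

# Convert seconds to hours
hours <- toy_time_spent / 3600

# Numeric summary to accompany the histogram
summary(hours)
quantile(hours, c(0.05, 0.5, 0.95))

# One-hour bins; plot = FALSE returns the counts without drawing
h <- hist(hours, breaks = seq(0, 14, by = 1), plot = FALSE)
h$counts
```

Dropping plot = FALSE and passing xlab = "Time Spent in Park (hours)" would draw the histogram in the same style as the one above.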