Bob Butler nuclear engineer, musician (www.pleasantonband.org), former city councilman and Mayor of Pleasanton, California died October fifth https://www.pleasantonweekly.com/news/2021/10/14/what-a-week-remembering-bob-butler-former-pleasanton-mayor-and-councilman. He helped me get traffic counts data from the Pleasanton Traffic Department.
The objective is to estimate origin-destination (O-D) matrix proportions and travel time distributions, without individual data, and use the matrix and distributions to help manage traffic. The traffic model is a network of k-dimension M/G/infinity nodes and edges connecting them. Data are transient arrivals and departure counts, without individual vehicle identification. Within nodes individuals turn or go straight randomly. Reconstruct route information by product and convolution.
Pleasanton is in the armpit of the intersection of two major freeways: in the southeast corner of the highways 580 and 680 intersection. Freeway drivers cut through Pleasanton to avoid the congested freeway interchange. In year 2000, Pleasanton assigned city cars to follow cars getting off freeways and driving into the city to see where they went and whether they went through. Drivers complained about being followed. Popular sentiment and complaints stopped city cars from tracking people driving through town.
Pleasanton changed traffic signal timing near freeway exits to discourage drivers from driving through town. Stopped cars backed up onto the freeways and caused accidents. I offered to help.
Pleasanton traffic department gave me traffic counts in 15 minute intervals from major intersections around edges of city. The intersection data told how many cars go straight, turn left, or turn right, minute by minute. I produced nonparametric least squares estimates of vehicle O-D matrixes and travel time distributions. [George 2000]
The Pleasanton O-D and travel-time traffic model is a network of M(t)/G/Infinity self-service queuing systems [Keilson and Servi] with non-stationary Poisson arrivals. Analyses of this kind of model have been called “network tomography,” https://en.wikipedia.org/wiki/Network_tomography [Castro et al.]. Leonard Kleinrock, a fellow UC Berkeley grad student, made a career analyzing the packet-switching Internet as networks of queues, https://nationalmedals.org/laureate/leonard-kleinrock/.
Table 1. Typical Pleasanton data, vehicle counts starting at beginning of rush hours. I combined several intersections into N, S, E, and W.
Input | Output | |||||||
Time | N | S | E | W | N | S | E | W |
Start | 15 | 104 | 54 | 16 | 4 | 47 | 46 | 8 |
0:15 | 12 | 73 | 37 | 14 | 3 | 60 | 43 | 10 |
0:30 | 17 | 74 | 50 | 13 | 3 | 42 | 18 | 9 |
0:45 | 8 | 56 | 41 | 11 | 3 | 31 | 29 | 10 |
1:00 | 9 | 36 | 34 | 13 | 2 | 21 | 27 | 6 |
1:15 | 5 | 39 | 22 | 6 | 0 | 21 | 28 | 6 |
1:30 | 8 | 31 | 23 | 1 | 2 | 27 | 21 | 5 |
Table 2. Simple O-D matrix. Body is proportion of inputs “From…” that go out “To…” “From 0” and “To 0” are those that originate in Pleasanton and those whose destination is within Pleasanton.
O from\D to-> | From 0 | From N | From S | From E | From W |
To 0 | 0.0000 | 0.8640 | 0.0000 | 0.0000 | 0.7801 |
To N | 0.2136 | 0.0000 | 0.0135 | 0.0000 | 0.0721 |
To S | 0.1801 | 0.0285 | 0.0000 | 1.0000 | 0.0000 |
To E | 0.1755 | 0.0177 | 0.2679 | 0.0000 | 0.1479 |
To W | 0.4308 | 0.0899 | 0.7186 | 0.0000 | 0.0000 |
Sum | 1 | 1 | 1 | 1 | 1 |
Table 3. Travel-time distribution estimates. Data counts were in 15-minute intervals, so could not quantify discrete probability density more finely.
Minutes | Pleasanton origin | Through Pleasanton |
0-15 | 1 | 0.979148 |
15-30 | 0 | 0 |
30-45 | 0 | 0.020852 |
The Pleasanton traffic manager didn’t understand O-D matrixes, probability distributions, or their use for managing traffic. In year 2009, Pleasanton stationed cheap labor at major intersections around edges of city to record license numbers of cars entering or leaving Pleasanton. That gave some sample data to estimate travel times through Pleasanton by license number. Some tollway operators are doing that using license-plate readers, to cite speeders or suspend their toll passes.
Sample or population?
It’s human nature to doubt statistically significant conclusions based on a sample that is a small fraction of the population.
Air travel surveys ~10%, on the day surveyors choose. Commuter surveys have non-response, bias. People asked, “Why are they surveying me?” I want Origin-Destination matrix population proportions and travel time distributions, without sample uncertainty.
The M/G/infinity self-service system service-time distribution G(t) can be estimated, without customer identification! Imagine the supermarket analog of estimating the probability distribution of time in store. Observe supermarket arrivals and departures, starting from the time the doors open but without identifying individual shoppers. The reliability analog is to regard products or parts as entering self-service when sold or installed, departing when they fail (complaints, failures, return, repair, replacements, spares sales, etc.): data required by GAAP! The traffic analog is to regard traffic as going into and coming out of a network of M(t)/G/infinity self-service systems plus some originating or terminating in Pleasanton. That uses population data!
Assume Poisson input at rate λ, service time distribution is G(t), and self-service. Output is Poisson at rate λG(t). Arrival and departure times or counts, without identification, are statistically sufficient to make nonparametric estimates of service-time distribution G(t). Same is true for non-stationary Poisson input M(t)/G/infinity systems. [https://lucas-accendo-site-speed.sprod01.rmkr.net/estimation-of-a-hidden-service-time-distribution-of-an-mt-g-%e2%88%9e-self-service-system/#more-450442]
Tandem M/G/infinity Self-Service Systems
Imagine two M/G/infinity self-service systems with the output of one being the input to the second. Input Poisson at rate λG1(t) to a second M(t)/G2/infinity self-service system. Output is Poisson at rate
∫λG1(u)G2(t-u)dG1(u), (INTEGRATE[λG1(u)G2(t-u)dG1(u)]),
where the integral is from 0 to t.
Estimate distributions G1(t) and G2(t) from input, intermediate, and output observation counts. Refer to diffusion approximation to tandem or parallel queues, [George, 1973], for generalizations or approximations to non-Poisson inputs.
Simple Traffic M/G2/∞2/p Model
Imagine two crossing M/G/infinity self-service systems, one North-South and the other East-West. Customers from N-S may switch to E-W and vice-versa with probability p (Figure 1). Assume p and travel-time distribution G(t) are the same regardless of directions. Assuming symmetry gives you twice as much data to estimate p and G(t)
Figure 1. Simple intersection modeled as two queuing systems
Observe 2 inputs and 2 outputs. Estimate average arrival rates λ1 and λ2 from input counts. Estimate p and G(t) from output counts t=1,2,….
r1(t) = ((1-p)λ1 +pλ2)G(t)
r2(t) = ((1-p)λ2 +pλ1)G(t)
M/G2/∞2/p with one observation at time t
Suppose you observe output counts r1(t) and r2(t) at time t, then
p = (λ2*r1(t) – λ1*r2(t)) / ((-λ1 + λ2)*(r1(t) + r2(t))) and
G(t) = (r1(t) + r2(t))/(λ1 + λ2).
Solve the two linear equations in two unknowns obtained by equating observed and expected output rates. Estimate lambdas from input rates:
r1(t) = 4, r2(t) = 3, λ1 = 10, λ2 = 20 gives
p = 5/7 and G(t) = 7/30.
Solutions for several observations
Use least squares estimation when more than one observation gives multiple solutions. If underdetermined, minimize sums of squared errors, SUM[(observed-expected)2]. For example:
r1(1) = 3, r2(1)= 4, r1(2) = 6, r2(2)= 8, l1 = 10, l2 = 20 gives
p = 0.285714, G(1) = 0.233333, G(2) = 0.466667.
With multiple observations and different turn rates and travel times, p1, p2, G11(t), G12 (t), G21(t), G22(t) depend on whether you want answers at time t or for all (i,j) pairs. If the system of equations is overdetermined, take more observations. Solutions must be constrained to have 0<p<1 and G(t) increasing in t. I used Excel Solver
M(t)/Gk/∞k network with switching
Imagine a network of intersection nodes and connecting routes. The whole thing could be modeled as a (filtration of) Markov process using states as node arrival and departure counts at arrival and departure times. Observe node inputs and outputs; inputs can have nonstationary rates. Assume stationary switching and travel time distributions. Route frequency is product of switch probabilities (assuming independence). Travel time is convolution of intra-node travel time distributions.
Summary
Estimation of O-D matrix proportions and travel time distributions doesn’t require customer or vehicle identification. Population data beats sample data. Least squares method works for transient processes: e.g. commutes, BART, other daily cycles and traffic light cycles.
References
Sharminda Bera and K. V. Krishna Rao , “Estimation of origin-destination matrix from traffic counts: the state of the art,” European Transport no. 49 (2011): 3-23
George, L. L., “Origin-Destination and Travel-Time Distribution, Without Surveys,” INFORMS, Salt Lake City, May 2000
George, L. L., “Diffusion Approximation for Two Channel, Poisson-Exponential Service Systems with Dependence,” Ph. D. thesis, University of California, Berkeley; Advisor, Richard E. Barlow, 1973
J. Keilson and L. D. Servi, “A Distributional Form of Little’s Law”, 1965, also Operations Research Letters, Volume 7, Issue 5, 1988, Pages 223-227, ISSN 0167-6377, https://doi.org/10.1016/0167-6377(88)90035-1.
Rui Castro, Mark Coates, Gang Liang, Robert Nowak, and Bin Yu, “Network Tomography: Recent Developments,” 2003
Leave a Reply