normal distribution random number algorithm


I need to develop a method that returns a random number based on specific parameters, as defined below.

For context, this method will be part of a wider implementation of a discrete-event simulation model that reflects a specific traffic system.

The input must be of type double, and the output must be of type double as well. The graph below is to be used to develop the algorithm that returns the random number.

As shown in the image, the mean service time changes as time passes. Between 06:00 and 07:10 (the first 70 mins), the mean service time is 1. Between 07:10 and 07:35 (70 mins to 95 mins), the mean service time is 3, and so on. The random number returned should display these characteristics, i.e. from time NOW = 0 (mins) to time NOW = 70 (mins) there is a mean service time of 1, etc.


I am not exactly sure how to go about developing this. Does it make sense to build an array of probabilities?

Any help would be appreciated.


  • If you want to fit your data exactly, draw a uniform random number between 0 and the total sum over all of your buckets, then walk the cumulative histogram and return the first bucket whose cumulative count exceeds that number.

    If you want to model your data on the assumption that it follows a normal distribution, you can use the observed data to estimate the parameters of that distribution via the maximum likelihood estimates for the mean and sigma, which are just the sample mean and the sample standard deviation (the version that divides by n). Once you have the parameters, you can turn a uniform random number into a normally distributed one by inverting the normal c.d.f. (note the analogy to the first option). A quick Google revealed that it's probably easiest to just transform [icode]generator.nextGaussian();[/icode] according to the parameters, i.e. multiply the result by sigma and add the mean.

    We can go into more depth once I get a better understanding of which parts of this you already know and which parts you're having difficulty figuring out how to implement.

    As a quick high-level view, the second approach generalizes better when the data is noisy, but worse when the assumptions about how that data is modeled are incorrect or too aggressive. The first approach assumes that you have a lot of observed data and just need to make a small number of predictions, without caring much about generalization or making assumptions about the distribution of your data.
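
To make both options concrete, here is a minimal Java sketch. It is not a definitive implementation: the class and method names (ServiceTimeSampler, meanAt, sample, sampleBucket) are made up for illustration, the sigma value is an assumption because the question does not give a spread, and only the first two time segments (mean 1 up to 70 mins, mean 3 up to 95 mins) come from the question. Past 95 mins the sketch simply reuses the last mean as a placeholder until the rest of the graph is filled in.

```java
import java.util.Random;

// Hypothetical helper; class and method names are illustrative only.
final class ServiceTimeSampler {
    // Segment end times in minutes since 06:00 and the mean service time in each.
    // Only these two segments are stated in the question; extend from the graph.
    static final double[] SEGMENT_END = {70.0, 95.0};
    static final double[] SEGMENT_MEAN = {1.0, 3.0};

    // Mean service time in force at simulation time `now` (minutes).
    // Assumption: past the last known segment, the last mean is reused.
    static double meanAt(double now) {
        for (int i = 0; i < SEGMENT_END.length; i++) {
            if (now < SEGMENT_END[i]) {
                return SEGMENT_MEAN[i];
            }
        }
        return SEGMENT_MEAN[SEGMENT_MEAN.length - 1];
    }

    // Second option: shift and scale a standard normal draw.
    // Note: the result can be negative; how to handle that is left to the caller.
    static double sample(double now, double sigma, Random rng) {
        return meanAt(now) + sigma * rng.nextGaussian();
    }

    // First option: pick a bucket index with probability proportional to its count.
    static int sampleBucket(double[] counts, Random rng) {
        double total = 0.0;
        for (double c : counts) {
            total += c;
        }
        double u = rng.nextDouble() * total; // uniform in [0, total)
        double cumulative = 0.0;
        for (int i = 0; i < counts.length; i++) {
            cumulative += counts[i];
            if (u < cumulative) {
                return i;
            }
        }
        return counts.length - 1; // guard against floating-point round-off
    }

    public static void main(String[] args) {
        Random rng = new Random();
        double sigma = 0.25; // assumed spread; the question does not give one
        System.out.println(sample(30.0, sigma, rng)); // drawn around mean 1
        System.out.println(sample(80.0, sigma, rng)); // drawn around mean 3
        System.out.println(sampleBucket(new double[] {5, 20, 10}, rng));
    }
}
```

Since a normal draw can come out negative and a service time cannot, you would still need to decide whether to redraw, clamp, or switch to a distribution that is positive by construction. Either choice changes the effective mean a little, so it is worth checking against your graph.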