# Guide: generating a cumulative distribution in Python

Contents

• Method 1: Using the histogram
• Method 2: Data sort
• How do you calculate cumulative distribution in Python?
• How do you find the cumulative distribution?
• How do you calculate CDF from data?
• How do you find the empirical cumulative distribution in Python?

Prerequisites: Matplotlib

Matplotlib is a plotting library for Python that works with NumPy arrays. The cumulative distribution function (CDF) of a real-valued random variable X, or just the distribution function of X, evaluated at x, is the probability that X takes a value less than or equal to x.

Properties of CDF:

• Every cumulative distribution function F(x) is non-decreasing.
• If X has a maximum possible value m, then F(x) = 1 for every x >= m; in general, F(x) approaches 1 as x grows.
• The CDF ranges from 0 to 1.

### Method 1: Using the histogram

The CDF can be calculated from the PDF (probability density function) or, for a discrete variable, from the probability of each value. The probabilities of all values up to x are accumulated to give F(x).

Example:

Take 2 balls, each of which is equally likely to be red (R) or blue (B). The possible outcomes are:

{RR, RB, BR, BB}

Let X be the number of red balls.

P(X = t):

t = 0 : 1 / 4 [BB]

t = 1 : 2 / 4 [RB, BR]

t = 2 : 1 / 4 [RR]

CDF:

F(t) = P(X <= t)

t = 0 : P(0)               -> 1 / 4

t = 1 : P(1) + P(0)        -> 3 / 4

t = 2 : P(2) + P(1) + P(0) -> 1
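The two-ball example can be checked with a few lines of standard-library Python. This is only a sketch: the names `outcomes`, `pmf`, and `cdf` are chosen here for illustration, and it assumes the four outcomes are equally likely.

```python
from fractions import Fraction
from itertools import product

# Enumerate the four equally likely outcomes: RR, RB, BR, BB.
outcomes = ["".join(p) for p in product("RB", repeat=2)]

# Probability of seeing exactly t red balls, for t = 0, 1, 2.
pmf = {
    t: Fraction(sum(o.count("R") == t for o in outcomes), len(outcomes))
    for t in range(3)
}

# The CDF is the running total of those probabilities.
cdf = {}
running = Fraction(0)
for t in sorted(pmf):
    running += pmf[t]
    cdf[t] = running

for t, prob in cdf.items():
    print(f"F({t}) = {prob}")
```

Using `Fraction` keeps the results exact, so they can be compared directly with the hand-computed values 1/4, 3/4, and 1.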

Approach

• Import modules
• Declare number of data points
• Initialize random values
• Compute the histogram of the data
• Get the histogram data (counts and bin edges)
• Find the PDF by normalizing the histogram counts
• Calculate CDF
• Plot CDF

Example:

```python
import numpy as np
import matplotlib.pyplot as plt

# Number of data points
N = 500

# Draw N samples from a standard normal distribution
data = np.random.randn(N)

# Histogram counts and bin edges (10 bins give 11 edges)
count, bins_count = np.histogram(data, bins=10)

# Normalize the counts so they sum to 1 (probability per bin)
pdf = count / count.sum()

# The cumulative sum of the per-bin probabilities gives the CDF
cdf = np.cumsum(pdf)

# Plot both against the right edge of each bin
plt.plot(bins_count[1:], pdf, color="red", label="PDF")
plt.plot(bins_count[1:], cdf, label="CDF")
plt.legend()
plt.show()
```

Output:

[Figure: plot of the PDF and CDF computed from the histogram]

### Method 2: Data sort

This method calculates and plots the CDF from sorted data. We first sort the data, then pair each sorted value with the proportion of observations at or below it.

Approach

• Import module
• Declare number of data points
• Create data
• Sort data in ascending order
• Get CDF
• Plot CDF
• Display plot

Example:

```python
import numpy as np
import matplotlib.pyplot as plt

# Number of data points
N = 500

# Draw N samples from a standard normal distribution
data = np.random.randn(N)

# Sort the data in ascending order
x = np.sort(data)

# Fraction of observations <= x[i]; runs from 1/N up to 1
y = np.arange(1, N + 1) / N

# Plot the CDF
plt.xlabel("x-axis")
plt.ylabel("y-axis")
plt.title("CDF using sorting the data")
plt.plot(x, y, marker="o")
plt.show()
```

Output:

[Figure: CDF computed by sorting the data]

### How do you calculate cumulative distribution in Python?

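Sort the data, then pair each sorted value with its cumulative proportion. Either of the following generates those proportions:

Use numpy.arange() to build the counts 1 to n and divide by n.

Use numpy.linspace() to build the same n evenly spaced values directly.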
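A minimal sketch of both variants, assuming the proportions should run from 1/n to 1 so that the last sorted point has height 1. The seed, sample size, and variable names are illustrative choices, not part of the original article.

```python
import numpy as np

rng = np.random.default_rng(0)  # seed chosen only for reproducibility
data = rng.normal(size=200)
n = data.size

# Sorted sample values go on the x-axis.
x = np.sort(data)

# Variant 1: numpy.arange() gives the counts 1..n, divided by n.
y_arange = np.arange(1, n + 1) / n

# Variant 2: numpy.linspace() gives n evenly spaced heights from 1/n to 1.
y_linspace = np.linspace(1 / n, 1.0, n)

# Both variants describe the same step heights.
print(np.allclose(y_arange, y_linspace))
```

Either array can be passed to `plt.plot(x, y)` to draw the CDF; the choice between them is mostly a matter of taste.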

### How do you find the cumulative distribution?

The cumulative distribution function (CDF) of a random variable X is denoted by F(x), and is defined as F(x) = Pr(X ≤ x). For example, for a fair six-sided die:

Pr(X ≤ 1) = 1/6

Pr(X ≤ 2) = 2/6

Pr(X ≤ 3) = 3/6

Pr(X ≤ 4) = 4/6

Pr(X ≤ 5) = 5/6

Pr(X ≤ 6) = 6/6 = 1
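The same table can be produced by accumulating the probability of each value. The sketch below assumes the values come from a fair six-sided die, which is how the fractions above read; `faces`, `pmf`, and `cdf` are illustrative names.

```python
from fractions import Fraction
from itertools import accumulate

faces = range(1, 7)
pmf = [Fraction(1, 6)] * 6  # assumption: each face is equally likely

# accumulate() yields the running sums, i.e. Pr(X <= face) for each face.
cdf = dict(zip(faces, accumulate(pmf)))

for face, prob in cdf.items():
    print(f"Pr(X <= {face}) = {prob}")
```

Note that `Fraction` prints values in lowest terms, so 2/6 appears as 1/3 and 3/6 as 1/2.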

### How do you calculate CDF from data?

Given a random variable X, its CDF is the function F(x) = Prob(X <= x), where the variable x runs through the real numbers. The distribution is called continuous if F(x) is the integral from -infinity to x of a function f, called the density function.
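To illustrate the integral definition, the sketch below integrates the standard normal density numerically and compares the result with the closed form based on `math.erf`. The standard normal is chosen here only as an example, the function names are hypothetical, and the lower limit of -10 is an assumption standing in for -infinity.

```python
import math
import numpy as np

def normal_pdf(t):
    """Density f(t) of the standard normal distribution."""
    return np.exp(-0.5 * t**2) / math.sqrt(2.0 * math.pi)

def normal_cdf_numeric(x, lower=-10.0, steps=20001):
    """Approximate F(x) as the integral of f from `lower` to x (x > lower)."""
    t = np.linspace(lower, x, steps)
    f = normal_pdf(t)
    dt = t[1] - t[0]
    # Trapezoidal rule: average adjacent heights, multiply by the step width.
    return float(np.sum((f[:-1] + f[1:]) * 0.5 * dt))

def normal_cdf_exact(x):
    """Closed form of the standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

for x in (-1.0, 0.0, 1.0):
    print(x, normal_cdf_numeric(x), normal_cdf_exact(x))
```

The two columns should agree to several decimal places; the remaining difference comes from the finite step size and the truncated lower limit.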

### How do you find the empirical cumulative distribution in Python?

The empirical distribution function (EDF) is calculated by ordering the observations in the data sample and, for each one, taking the cumulative probability to be the number of observations less than or equal to it divided by the total number of observations n. That is, EDF(x) = (number of observations <= x) / n.
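The formula translates into a short NumPy function. This is one possible sketch, not a library API: the name `ecdf` and the sample values are made up for illustration.

```python
import numpy as np

def ecdf(sample, x):
    """Return EDF(x) = (number of observations <= x) / n for each x."""
    sample = np.sort(np.asarray(sample))
    # side="right" returns, for each x, how many sorted values are <= x.
    counts = np.searchsorted(sample, x, side="right")
    return counts / sample.size

sample = [3, 1, 4, 1, 5]
print(ecdf(sample, [0, 1, 3.5, 5]))
```

Sorting once and using `np.searchsorted` avoids comparing every query point against every observation, which matters for large samples.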

From the website harveymomstudy.com