DataFrame#
Constructor#
| Two-dimensional, size-mutable, potentially heterogeneous tabular data. |
Attributes and underlying data#
Axes
The index (row labels) of the DataFrame. | |
The column labels of the DataFrame. |
Return the dtypes in the DataFrame. | |
| Print a concise summary of a DataFrame. |
| Return a subset of the DataFrame's columns based on the column dtypes. |
Return a NumPy representation of the DataFrame. | |
Return a list representing the axes of the DataFrame. | |
Return an int representing the number of axes / array dimensions. | |
Return an int representing the number of elements in this object. | |
Return a tuple representing the dimensionality of the DataFrame. | |
| Return the memory usage of each column in bytes. |
Indicator whether Series/DataFrame is empty. | |
| Return a new object with updated flags. |
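The attributes above can be illustrated with a minimal sketch. The frame contents and the variable names are made up for illustration and are not part of the reference itself.

```python
import pandas as pd

# Illustrative data only; column names are assumptions for this sketch.
df = pd.DataFrame({"name": ["a", "b", "c"], "score": [1.5, 2.0, 3.5]})

n_rows, n_cols = df.shape        # tuple describing the dimensionality
n_dims = df.ndim                 # number of axes (always 2 for a DataFrame)
n_elements = df.size             # rows * columns
is_empty = df.empty              # True only when an axis has length 0
row_labels = list(df.index)      # default RangeIndex -> 0, 1, 2
column_labels = list(df.columns)

# Keep only the numeric columns, selected by dtype.
numeric_only = df.select_dtypes(include="number")

# Per-column memory usage in bytes (values depend on platform and version).
usage = df.memory_usage()
```

`df.info()` prints a summary of the same information to standard output rather than returning it.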
Conversion#
| Cast a pandas object to a specified dtype |
| Convert columns to the best possible dtypes using dtypes supporting pd.NA. |
| Attempt to infer better dtypes for object columns. |
| Make a copy of this object's indices and data. |
(DEPRECATED) Return the bool of a single element Series or DataFrame. | |
| Convert the DataFrame to a NumPy array. |
Indexing, iteration#
| Return the first n rows. |
Access a single value for a row/column label pair. | |
Access a single value for a row/column pair by integer position. | |
Access a group of rows and columns by label(s) or a boolean array. | |
Purely integer-location based indexing for selection by position. | |
| Insert column into DataFrame at specified location. |
Iterate over info axis. | |
Iterate over (column name, Series) pairs. | |
Get the 'info axis' (see Indexing for more). | |
Iterate over DataFrame rows as (index, Series) pairs. | |
| Iterate over DataFrame rows as namedtuples. |
| Return item and drop from frame. |
| Return the last n rows. |
| Return cross-section from the Series/DataFrame. |
| Get item from object for given key (ex: DataFrame column). |
| Whether each element in the DataFrame is contained in values. |
| Replace values where the condition is False. |
| Replace values where the condition is True. |
| Query the columns of a DataFrame with a boolean expression. |
For more information on .at, .iat, .loc, and .iloc, see the indexing documentation.
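As a rough sketch of how the four accessors differ, assuming a small frame with string row labels (the data is illustrative only):

```python
import pandas as pd

# Illustrative frame with explicit row labels.
df = pd.DataFrame(
    {"x": [10, 20, 30], "y": [1.0, 2.0, 3.0]},
    index=["a", "b", "c"],
)

# Scalar access: .at uses labels, .iat uses integer positions.
by_label = df.at["b", "x"]
by_position = df.iat[2, 1]

# Group access: .loc slices by label and includes the end label,
# while .iloc slices by position and excludes the end position.
label_slice = df.loc["a":"b", ["x"]]
position_slice = df.iloc[0:2, 0:1]

# Boolean selection through .loc and through query().
large_x = df.loc[df["x"] > 10]
large_x_query = df.query("x > 10")
```

Here `label_slice` and `position_slice` select the same two rows and one column, which shows the inclusive versus exclusive end of the two slicing styles.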
Binary operator functions#
| Get Addition of DataFrame and other, column-wise. |
| Get Addition of dataframe and other, element-wise (binary operator add). |
| Get Subtraction of dataframe and other, element-wise (binary operator sub). |
| Get Multiplication of dataframe and other, element-wise (binary operator mul). |
| Get Floating division of dataframe and other, element-wise (binary operator truediv). |
| Get Floating division of dataframe and other, element-wise (binary operator truediv). |
| Get Integer division of dataframe and other, element-wise (binary operator floordiv). |
| Get Modulo of dataframe and other, element-wise (binary operator mod). |
| Get Exponential power of dataframe and other, element-wise (binary operator pow). |
| Compute the matrix multiplication between the DataFrame and other. |
| Get Addition of dataframe and other, element-wise (binary operator radd). |
| Get Subtraction of dataframe and other, element-wise (binary operator rsub). |
| Get Multiplication of dataframe and other, element-wise (binary operator rmul). |
| Get Floating division of dataframe and other, element-wise (binary operator rtruediv). |
| Get Floating division of dataframe and other, element-wise (binary operator rtruediv). |
| Get Integer division of dataframe and other, element-wise (binary operator rfloordiv). |
| Get Modulo of dataframe and other, element-wise (binary operator rmod). |
| Get Exponential power of dataframe and other, element-wise (binary operator rpow). |
| Get Less than of dataframe and other, element-wise (binary operator lt). |
| Get Greater than of dataframe and other, element-wise (binary operator gt). |
| Get Less than or equal to of dataframe and other, element-wise (binary operator le). |
| Get Greater than or equal to of dataframe and other, element-wise (binary operator ge). |
| Get Not equal to of dataframe and other, element-wise (binary operator ne). |
| Get Equal to of dataframe and other, element-wise (binary operator eq). |
| Perform column-wise combine with another DataFrame. |
| Update null elements with value in the same location in other. |
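The method forms of the binary operators accept options that the plain operators do not. The sketch below, using made-up data, contrasts `+` with `add(..., fill_value=...)` on two frames whose indexes only partly overlap, and then uses `combine_first` to patch missing entries.

```python
import pandas as pd

# Two illustrative frames that share only index label 1.
left = pd.DataFrame({"a": [1.0, 2.0]}, index=[0, 1])
right = pd.DataFrame({"a": [10.0, 20.0]}, index=[1, 2])

# The operator aligns on labels; non-overlapping labels produce NaN.
plain = left + right

# The method form can substitute a value for entries missing on one side.
filled = left.add(right, fill_value=0)

# Element-wise comparison returns a boolean frame of the same shape.
greater = left.gt(1.5)

# combine_first keeps values from `left` and fills its gaps from `right`.
patched = left.combine_first(right)
```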
Function application, GroupBy & window#
| Apply a function along an axis of the DataFrame. |
| Apply a function to a DataFrame elementwise. |
| (DEPRECATED) Apply a function to a DataFrame elementwise. |
| Apply chainable functions that expect Series or DataFrames. |
| Aggregate using one or more operations over the specified axis. |
| Aggregate using one or more operations over the specified axis. |
| Call func on self producing a DataFrame with the same axis shape as self. |
| Group DataFrame using a mapper or by a Series of columns. |
| Provide rolling window calculations. |
| Provide expanding window calculations. |
| Provide exponentially weighted (EW) calculations. |
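A short sketch of function application, grouping, and a rolling window follows. The `team` and `points` columns and the helper `add_bonus` are hypothetical names chosen for this example.

```python
import pandas as pd

# Illustrative data; column names are assumptions for this sketch.
df = pd.DataFrame({"team": ["x", "x", "y", "y"], "points": [1, 2, 3, 4]})

# apply: run a function on each column (the default axis).
value_range = df[["points"]].apply(lambda col: col.max() - col.min())

# agg: several aggregations at once; the result is indexed by their names.
summary = df[["points"]].agg(["min", "max"])

# groupby: split rows by a key column, then aggregate each group.
totals = df.groupby("team")["points"].sum()

# rolling: mean over a sliding window of two rows (first row has no full window).
rolling_mean = df[["points"]].rolling(window=2).mean()


def add_bonus(frame, bonus):
    """Hypothetical helper: return a copy with `bonus` added to points."""
    return frame.assign(points=frame["points"] + bonus)


# pipe: chain a function that expects a DataFrame as its first argument.
with_bonus = df.pipe(add_bonus, bonus=10)
```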
Computations / descriptive stats#
Return a Series/DataFrame with absolute numeric value of each element. | |
| Return whether all elements are True, potentially over an axis. |
| Return whether any element is True, potentially over an axis. |
| Trim values at input threshold(s). |
| Compute pairwise correlation of columns, excluding NA/null values. |
| Compute pairwise correlation. |
| Count non-NA cells for each column or row. |
| Compute pairwise covariance of columns, excluding NA/null values. |
| Return cumulative maximum over a DataFrame or Series axis. |
| Return cumulative minimum over a DataFrame or Series axis. |
| Return cumulative product over a DataFrame or Series axis. |
| Return cumulative sum over a DataFrame or Series axis. |
| Generate descriptive statistics. |
| First discrete difference of element. |
| Evaluate a string describing operations on DataFrame columns. |
| Return unbiased kurtosis over requested axis. |
| Return unbiased kurtosis over requested axis. |
| Return the maximum of the values over the requested axis. |
| Return the mean of the values over the requested axis. |
| Return the median of the values over the requested axis. |
| Return the minimum of the values over the requested axis. |
| Get the mode(s) of each element along the selected axis. |
| Fractional change between the current and a prior element. |
| Return the product of the values over the requested axis. |
| Return the product of the values over the requested axis. |
| Return values at the given quantile over requested axis. |
| Compute numerical data ranks (1 through n) along axis. |
| Round a DataFrame to a variable number of decimal places. |
| Return unbiased standard error of the mean over requested axis. |
| Return unbiased skew over requested axis. |
| Return the sum of the values over the requested axis. |
| Return sample standard deviation over requested axis. |
| Return unbiased variance over requested axis. |
| Count number of distinct elements in specified axis. |
| Return a Series containing the frequency of each distinct row in the DataFrame. |
Reindexing / selection / label manipulation#
| Prefix labels with string prefix. |
| Suffix labels with string suffix. |
| Align two objects on their axes with the specified join method. |
| Select values at particular time of day (e.g., 9:30AM). |
| Select values between particular times of the day (e.g., 9:00-9:30 AM). |
| Drop specified labels from rows or columns. |
| Return DataFrame with duplicate rows removed. |
| Return boolean Series denoting duplicate rows. |
| Test whether two objects contain the same elements. |
| Subset the dataframe rows or columns according to the specified index labels. |
| (DEPRECATED) Select initial periods of time series data based on a date offset. |
| Return the first n rows. |
| Return index of first occurrence of maximum over requested axis. |
| Return index of first occurrence of minimum over requested axis. |
| (DEPRECATED) Select final periods of time series data based on a date offset. |
| Conform DataFrame to new index with optional filling logic. |
| Return an object with matching indices as other object. |
| Rename columns or index labels. |
| Set the name of the axis for the index or columns. |
| Reset the index, or a level of it. |
| Return a random sample of items from an axis of object. |
| Assign desired index to given axis. |
| Set the DataFrame index using existing columns. |
| Return the last n rows. |
| Return the elements in the given positional indices along an axis. |
| Truncate a Series or DataFrame before and after some index value. |
Missing data handling#
| (DEPRECATED) Fill NA/NaN values by using the next valid observation to fill the gap. |
| Fill NA/NaN values by using the next valid observation to fill the gap. |
| Remove missing values. |
| Fill NA/NaN values by propagating the last valid observation to next valid. |
| Fill NA/NaN values using the specified method. |
| Fill NaN values using an interpolation method. |
Detect missing values. | |
DataFrame.isnull is an alias for DataFrame.isna. | |
Detect existing (non-missing) values. | |
DataFrame.notnull is an alias for DataFrame.notna. | |
| (DEPRECATED) Fill NA/NaN values by propagating the last valid observation to next valid. |
| Replace values given in to_replace with value. |
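The missing-data methods can be compared side by side on a small frame. This is a minimal sketch with invented values; it uses only the non-deprecated methods from the list above.

```python
import pandas as pd

nan = float("nan")

# Illustrative frame with one gap in each column.
df = pd.DataFrame({"a": [1.0, nan, 3.0], "b": [nan, 5.0, 6.0]})

# Detect: boolean mask of missing cells, and its complement.
missing = df.isna()
present = df.notna()

# Remove: drop every row that contains at least one missing value.
dropped = df.dropna()

# Fill: a constant, the last valid observation, or the next valid one.
constant = df.fillna(0.0)
forward = df.ffill()
backward = df.bfill()

# Interpolate: linear by default, filling gaps between valid points.
interpolated = df.interpolate()
```

Note that `ffill` cannot fill a gap at the start of a column and `bfill` cannot fill one at the end, because there is no earlier or later observation to copy.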
Reshaping, sorting, transposing#
| Return Series/DataFrame with requested index / column level(s) removed. |
| Return reshaped DataFrame organized by given index / column values. |
| Create a spreadsheet-style pivot table as a DataFrame. |
| Rearrange index levels using input order. |
| Sort by the values along either axis. |
| Sort object by labels (along an axis). |
| Return the first n rows ordered by columns in descending order. |
| Return the first n rows ordered by columns in ascending order. |
| Swap levels i and j in a MultiIndex. |
| Stack the prescribed level(s) from columns to index. |
| Pivot a level of the (necessarily hierarchical) index labels. |
| (DEPRECATED) Interchange axes and swap values axes appropriately. |
| Unpivot a DataFrame from wide to long format, optionally leaving identifiers set. |
| Transform each element of a list-like to a row, replicating index values. |
| Squeeze 1 dimensional axis objects into scalars. |
Return an xarray object from the pandas object. | |
The transpose of the DataFrame. | |
| Transpose index and columns. |
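A minimal sketch of a wide-to-long round trip with `melt` and `pivot`, plus sorting and transposing. The `id`, `jan`, and `feb` columns are invented for the example.

```python
import pandas as pd

# Illustrative wide-format data: one column per month.
wide = pd.DataFrame({"id": [1, 2], "jan": [10, 20], "feb": [30, 40]})

# melt: unpivot month columns into (month, value) rows, keeping `id`.
long = wide.melt(id_vars="id", var_name="month", value_name="value")

# pivot: reshape back so that each month becomes a column again.
back = long.pivot(index="id", columns="month", values="value")

# sort_values: order rows by a column; nlargest is a shortcut for the top n.
by_value = long.sort_values("value", ascending=False)
top_two = long.nlargest(2, "value")

# Transpose swaps the index and the columns.
transposed = wide.T
```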
Combining / comparing / joining / merging#
| Assign new columns to a DataFrame. |
| Compare to another DataFrame and show the differences. |
| Join columns of another DataFrame. |
| Merge DataFrame or named Series objects with a database-style join. |
| Modify in place using non-NA values from another DataFrame. |
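As a sketch of the joining methods, the example below merges two small frames on a shared key and then derives a column with `assign`. All column names and values are illustrative.

```python
import pandas as pd

# Illustrative frames sharing a `key` column.
left = pd.DataFrame({"key": ["a", "b", "c"], "lval": [1, 2, 3]})
right = pd.DataFrame({"key": ["b", "c", "d"], "rval": [20, 30, 40]})

# merge: database-style join; the default keeps only keys present in both.
inner = left.merge(right, on="key")

# how="outer" keeps keys from either side, with NaN where a side is missing.
outer = left.merge(right, on="key", how="outer")

# assign: return a new frame with an extra column computed from existing ones.
with_total = inner.assign(total=lambda d: d["lval"] + d["rval"])
```

`merge` and `assign` both return new objects, whereas `update` modifies the calling frame in place.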
Flags#
Flags refer to attributes of the pandas object. Properties of the dataset (like the date it was recorded, the URL it was accessed from, etc.) should be stored in DataFrame.attrs.
| Flags that apply to pandas objects. |
Metadata#
DataFrame.attrs is a dictionary for storing global metadata for this DataFrame.
Warning
DataFrame.attrs is considered experimental and may change without warning.
Dictionary of global attributes of this dataset. |
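The distinction between flags and metadata can be sketched as follows. The `source` key and its value are hypothetical, and since `attrs` is experimental, this behavior may change between pandas versions.

```python
import pandas as pd

df = pd.DataFrame({"a": [1, 2]})

# attrs: free-form metadata about the dataset (hypothetical key and value).
df.attrs["source"] = "hypothetical-sensor-01"
source = df.attrs["source"]

# flags: properties of the pandas object itself.
allows_duplicates = df.flags.allows_duplicate_labels

# set_flags returns a new object; the original keeps its own flags.
strict = df.set_flags(allows_duplicate_labels=False)
```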
Plotting#
DataFrame.plot is both a callable method and a namespace attribute for specific plotting methods of the form DataFrame.plot.<kind>.
| DataFrame plotting accessor and method |
| Draw a stacked area plot. |
| Vertical bar plot. |
| Make a horizontal bar plot. |
| Make a box plot of the DataFrame columns. |
| Generate Kernel Density Estimate plot using Gaussian kernels. |
| Generate a hexagonal binning plot. |
| Draw one histogram of the DataFrame's columns. |
| Generate Kernel Density Estimate plot using Gaussian kernels. |
| Plot Series or DataFrame as lines. |
| Generate a pie plot. |
| Create a scatter plot with varying marker point size and color. |
| Make a box plot from DataFrame columns. |
| Make a histogram of the DataFrame's columns. |
Sparse accessor#
Sparse-dtype specific methods and attributes are provided under the DataFrame.sparse accessor.
Ratio of non-sparse points to total (dense) data points. |
| Create a new DataFrame from a scipy sparse matrix. |
Return the contents of the frame as a sparse SciPy COO matrix. | |
Convert a DataFrame with sparse values to dense. |
Serialization / IO / conversion#
| Construct DataFrame from dict of array-like or dicts. |
| Convert structured or record ndarray to DataFrame. |
| Write a DataFrame to the ORC format. |
| Write a DataFrame to the binary parquet format. |
| Pickle (serialize) object to file. |
| Write object to a comma-separated values (csv) file. |
| Write the contained data to an HDF5 file using HDFStore. |
| Write records stored in a DataFrame to a SQL database. |
| Convert the DataFrame to a dictionary. |
| Write object to an Excel sheet. |
| Convert the object to a JSON string. |
| Render a DataFrame as an HTML table. |
| Write a DataFrame to the binary Feather format. |
| Render object to a LaTeX tabular, longtable, or nested table. |
| Export DataFrame object to Stata dta format. |
| (DEPRECATED) Write a DataFrame to a Google BigQuery table. |
| Convert DataFrame to a NumPy record array. |
| Render a DataFrame to a console-friendly tabular output. |
| Copy object to the system clipboard. |
| Print DataFrame in Markdown-friendly format. |
Returns a Styler object. | |
| Return the dataframe interchange object implementing the interchange protocol. |
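A few of the conversion methods above work entirely in memory, which makes them easy to try out. The following is a minimal sketch with invented data; the file-based writers (Parquet, Excel, SQL, and so on) typically need optional dependencies and are not shown.

```python
import pandas as pd

# from_dict: build a frame from a dict of array-likes (illustrative data).
df = pd.DataFrame.from_dict({"a": [1, 2], "b": ["x", "y"]})

# to_dict: orient="records" yields one dict per row.
records = df.to_dict(orient="records")

# to_csv: with no path given, the CSV text is returned as a string.
csv_text = df.to_csv(index=False)
csv_lines = csv_text.splitlines()

# to_json: with no path given, a JSON string is returned.
json_text = df.to_json(orient="records")

# to_numpy: convert the frame to a NumPy array.
array = df[["a"]].to_numpy()
```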