Polars - Read CSV into DataFrame

Polars, a powerful DataFrame library, provides an intuitive and efficient way to read CSV files into DataFrames. If you’re familiar with pandas, you’ll find Polars’ methods easy to use. In this article, we’ll explore how to read CSV files using Polars and cover some useful features.

Reading CSV Files

To read a CSV file into a DataFrame, use the polars.read_csv() function. Here’s how:

import polars as pl

# Read a CSV file into a DataFrame
df = pl.read_csv('my_data.csv')

# Display the first few rows
print(df.head())

By default, Polars assumes that the first row of the CSV file contains column names. If your CSV file doesn’t have a header row, set has_header=False; Polars then names the columns column_1, column_2, and so on:

df = pl.read_csv('my_data.csv', has_header=False)

Additional Options

Custom Separators: By default, Polars assumes that the separator is a comma (,). You can specify a different separator using the separator argument:

df = pl.read_csv('my_data.csv', separator=';')

Column Selection: You can read only specific columns by passing a list of column names or zero-based column indices to the columns argument:

df = pl.read_csv('my_data.csv', columns=['name', 'age'])

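Schema Inference: Polars infers the schema (data types) of columns automatically. To set the type of specific columns manually, use the schema_overrides argument (named dtypes before Polars 0.20.31; the old name is deprecated in later releases):

df = pl.read_csv('my_data.csv', schema_overrides={'age': pl.Int32})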

Null Values: Specify values to interpret as null (e.g., "NA", "null"):

df = pl.read_csv('my_data.csv', null_values=['NA', 'null'])

Conclusion

Polars simplifies CSV file handling with read_csv(): a few keyword arguments cover headers, separators, column selection, data types, and null values. Whether you’re analyzing data or building machine learning models, Polars’ intuitive interface makes it a great choice for data manipulation.

Happy coding with Polars! 🚀📊


Publish Date: 2024-05-08, Update Date: 2024-05-08