Python is one of the easiest programming languages for beginners to learn, yet it is powerful enough for tasks like AI and machine learning. Easy to pick up as it is, though, you’re still bound to run into the occasional bug or cryptic error.
In this article, we’re talking about Pandas’ “Error tokenizing data. C error” issue, its causes and what you can do to fix the problem.
What causes the “Error tokenizing data. C error” issue?
In most cases, the issue comes down to one of two causes (the sketch after this list reproduces the error):
- Incorrect or inconsistent delimiters in the input data
- Conflicting headers or a mismatched number of columns in the file
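For example, a CSV file where one row has more fields than the header declares will trip Pandas’ C parser. The short sketch below uses made-up inline data to reproduce the error message:
import io
import pandas as pd

# Hypothetical CSV: the header declares two columns, but line 3 has three fields
bad_csv = io.StringIO(
    "name,age\n"
    "Alice,30\n"
    "Bob,25,extra\n"
)

try:
    pd.read_csv(bad_csv)
except pd.errors.ParserError as err:
    # Prints something like:
    # Error tokenizing data. C error: Expected 2 fields in line 3, saw 3
    print(err)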
Also read: How to fix Fatal error: Python.h: No such file or directory?
How to fix the “Error tokenizing data. C error” issue?
Here are two fixes you can try out.
Specify the correct delimiter and headers
As mentioned above, incorrect delimiters or conflicting headers can very well cause the issue. You can counter this by telling Pandas exactly what the delimiter is and dropping the header row entirely, as follows.
import pandas as pd

# Read the file with an explicit delimiter and without treating the first row as a header
dataVariable = pd.read_csv('testFile.csv', sep=';', header=None)
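If you still want labelled columns after dropping the broken header, you can pass your own with the names parameter. The column names below are placeholders for whatever your file actually contains:
dataVariable = pd.read_csv('testFile.csv', sep=';', header=None, names=['id', 'name', 'score'])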
Skipping bad rows
You can also try to skip bad rows that might be causing conflicts with your data. To do so, we’ll be using the on_bad_lines='skip' parameter in our data read command.
dataVariable = pd.read_csv('testFile.csv', on_bad_lines='skip')
The command above skips any ‘bad’ or offending lines entirely, making sure the read completes without issues. Keep in mind that on_bad_lines was only added in Pandas 1.3.0; if you’re on an older version, use the error_bad_lines flag instead.
dataVariable = pd.read_csv("testFile.csv", error_bad_lines=False)