I have this code to split a CSV file to others depend on the number of the column, what the code that I should add to re

Posted: Thu Jun 02, 2022 8:21 am
by answerhappygod
I have this code that splits a CSV file into several files, one per
value of a chosen column. What code should I add to remove duplicate
rows and to replace any empty value with the word NULL?
Sample data:
Category,Title,Sales
"Books","Harry Potter",1441556
"Books","Lord of the Rings",14251154
"Series", "Breaking Bad",6246234
"Books","The Alchemist",12562166
"Books","The Alchemist",12562166
"Movie","",1573437
The code:
import csv

# Set of category values that already have an output file
filelist = set()

# Open the large csv file in "read" mode
with open('//directory/largefile', 'r', newline='') as csvfile:
    read_rows = csv.reader(csvfile)
    # Read the first row of the large file and store the whole row
    # as a string (headerstring)
    headerrow = next(read_rows)
    headerstring = ','.join(headerrow)
    for row in read_rows:
        # Store the whole row as a string (rowstring)
        rowstring = ','.join(row)
        # Use the first entry in the row as the filename. This could be
        # made dynamic so that the user inputs a column name to use.
        filename = row[0]
        # Make sure we are not looking at a repeated header row
        if filename != "Category":
            with open('//directory/subfiles/' + str(filename) + '.csv', 'a') as f:
                # If the filename is not in the filelist set, add it and
                # start the new csv file with the header row
                if filename not in filelist:
                    filelist.add(filename)
                    f.write(headerstring)
                    f.write("\n")
                # Append the current row in every case; previously the
                # first row of each category was never written
                f.write(rowstring)
                f.write("\n")
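
One possible way to add the two requested steps, as a minimal sketch rather than a definitive answer. It assumes that "duplicate" means a row whose values all match an earlier row, and that "empty" means an empty or whitespace-only field. The function name split_csv and its parameters (source_path, out_dir, column, placeholder) are hypothetical names chosen for this example. It writes with csv.writer instead of joining strings so that fields containing commas stay quoted, and it passes skipinitialspace=True so that a line such as "Series", "Breaking Bad" is parsed without the stray space.

```python
import csv
import os


def split_csv(source_path, out_dir, column=0, placeholder="NULL"):
    """Split source_path into one file per value found in `column`.

    Empty fields become `placeholder`; exact duplicate rows are skipped.
    Returns a dict mapping each output path to the number of rows written.
    """
    os.makedirs(out_dir, exist_ok=True)
    seen_rows = set()  # cleaned rows that have already been written
    counts = {}        # output path -> number of data rows written

    with open(source_path, "r", newline="") as csvfile:
        # skipinitialspace drops the space that follows a delimiter
        reader = csv.reader(csvfile, skipinitialspace=True)
        header = next(reader)
        for row in reader:
            if not row:
                continue  # ignore blank lines
            # Replace empty or whitespace-only values with the placeholder
            cleaned = tuple(v if v.strip() else placeholder for v in row)
            if cleaned in seen_rows:
                continue  # duplicate row: skip it
            seen_rows.add(cleaned)

            path = os.path.join(out_dir, cleaned[column] + ".csv")
            is_new = path not in counts
            # Overwrite on first use so a rerun does not append to old output
            with open(path, "w" if is_new else "a", newline="") as f:
                writer = csv.writer(f)
                if is_new:
                    writer.writerow(header)
                writer.writerow(cleaned)
            counts[path] = counts.get(path, 0) + 1
    return counts
```

With the sample data above, a call such as split_csv('//directory/largefile', '//directory/subfiles') would write the second "The Alchemist" row only once and store the Movie row with NULL as its title. Keeping every seen row in a set costs memory in proportion to the number of distinct rows, which may matter for a very large file.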