Tuesday, October 12, 2010

Read CSV File Using Regular Expressions in Java

Example 2:

Reading CSV File and Storing it in Database with Regular Expressions

Step 1:

File CsvFile =new File("File Path");
FileReader inpFile = new FileReader(CsvFile);
BufferedReader inpReader = new BufferedReader(inpFile);

String inpLine;
Step 2:

//Reading Line By Line
while ((inpLine = inpReader.readLine()) != null)
{
  List result = parseCSV(inpLine, ",");
  String Fields[] = (String[]) result.toArray(new String[0]);
  DoInsertIntoDB(Feilds);
}
Step 3:

//Using Regular Expression
public List parseCSV(String csv, String delim)
{
  final Pattern NEXT_COLUMN = nextColumnRegex(delim);
  final List strings = new ArrayList();
  final Matcher matcher = NEXT_COLUMN.matcher(csv);
  while (!matcher.hitEnd() && matcher.find()) 
  {
    String match = matcher.group(1);
    if (match.matches(quoted)) 
    {
      match = match.substring(1, match.length() - 1);
    }
    match = match.replaceAll("\"\"", "\"");
    strings.add(match);
  }
return strings;
}
private Pattern nextColumnRegex(String comma) 
{
  String unquoted = "(:?[^\"" + comma + "]|\"\")*";
  String ending = "(:?" + comma + "|$)";
return Pattern.compile('(' + quoted + '|' + unquoted + ')'
                                               + ending);
}
Step 4:

//Inserting Into Database
public void DoInsertIntoDB(String[] inpFeilds)
{
  String Query="insert into emp(id,name)values(?,?)";
  PreparedStatement InsST=connetion.prepareStatement(Query);
  for (int i = 0; i < inpFeilds.length; i++)
  {
    inpFeilds[i] = inpFeilds[i].replace("\"", "");
  }

  InsST.setString(1, inpFeilds[0]);
  InsST.setString(1, inpFeilds[2]);
  InsST.executeUpdate();
}

//This is what you need to do when 
                 reading .csv file.

Reading CSV File using StringTokenizer

There are two ways to read a .csv file and store it in the database.


1)Reading the file line by line and split that line using
StringTokenizer and insert into database.

2)using regular expressions in Java .

Example 1:

Step 1:
File CsvFile =new File("File Path");
FileReader inpFile = new FileReader(CsvFile);
BufferedReader inpReader = new BufferedReader(inpFile);
String inpLine; 

Step 2:
//Reading file line by line
while ((inpLine = inpReader.readLine()) != null) 
{
  String[] Feilds = inpLine.split(",\\s*"); //Specify delimiter
  DoInsertIntoDB(Feilds); //Calling method to insert in Database
}

Step 3:
//Inserting In Database
public void DoInsertIntoDB(String[] inpFeilds)
{
  String Query="insert into emp(id,name)values(?,?)";  
  PreparedStatement InsST=connection.prepareStatement(Query);
  for (int i = 0; i < inpFeilds.length; i++) 
  {
     inpFeilds[i] = inpFeilds[i].replace("\"", "");
  }
  InsST.setString(1, inpFeilds[0]);
  InsST.setString(1, inpFeilds[2]);
  InsST.executeUpdate();
}

Explanation:
In this example i have read the csv file line by line and 
     split the line using StringTokenizer
  In StringTokenizer i have specified , as my delimeter.
      so you can specify whatever delimiter you want.
Disadvantages:
1) If the data in the csv file like "author1,author2" 
     then the above example will read the line and split it 
       into two strings and inserts it into two different columns.
         but you need these full string to be in single column.
           so you can achieve this using regular expression.