Mission
The goal of this project is to create a Java library that contains a parser for flat files and CSV files. The library should be simple to use and easy to extend.
Existing features
- Support for flat files with fixed positions.
- Support for CSV files.
- The schema can be expressed in XML notation or created directly in Java code.
- The parser can either produce a Document class, representing the content of the file, or you can choose to receive events for each line that has been successfully parsed.
- Can handle huge files without loading everything into memory.
- The output Document class contains a list of lines, each of which contains a list of cells.
- The Document class can be transformed into a Java object (via reflection) if the schema is carefully written.
- It is also possible to produce Java objects directly from the parser.
- It is possible to convert a list of Java objects into a file according to a schema if the schema is carefully written.
- The Document class can be built from an XML file (according to an internal XML schema).
- Input and output are given by java.io.Reader and java.io.Writer, which means that the data parsed or generated does not have to be a file.
- The file parsing schema contains information about how to parse each cell regarding data type and syntax.
- Parsing errors can either be raised as exceptions at the first error, or be collected during parsing so that they can be dealt with later.
- JUnit tests for most classes within the library.
- Support for localisation.
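Several of the features above (events per parsed line, reading from a java.io.Reader without loading everything into memory, and collecting errors instead of failing at the first one) can be illustrated with a small sketch in plain Java. This is not JSaPar code: the class StreamingLineSketch, the LineError type, the parse method and the choice of ";" as separator are all assumptions made up for the example.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical sketch; none of these names are part of the JSaPar API.
final class StreamingLineSketch {

    /** One problem found while parsing, recorded instead of thrown. */
    static final class LineError {
        final int lineNumber;
        final String message;

        LineError(int lineNumber, String message) {
            this.lineNumber = lineNumber;
            this.message = message;
        }
    }

    /**
     * Reads the input one line at a time, so only the current line is held
     * in memory. Each line with the expected number of cells is passed to
     * the listener; every other line is recorded as an error and skipped.
     */
    static List<LineError> parse(Reader input, int expectedCells,
                                 Consumer<List<String>> onLine) {
        List<LineError> errors = new ArrayList<>();
        try (BufferedReader reader = new BufferedReader(input)) {
            String raw;
            int lineNumber = 0;
            while ((raw = reader.readLine()) != null) {
                lineNumber++;
                // Limit -1 keeps trailing empty cells.
                List<String> cells = Arrays.asList(raw.split(";", -1));
                if (cells.size() != expectedCells) {
                    errors.add(new LineError(lineNumber, "expected "
                            + expectedCells + " cells but found " + cells.size()));
                    continue;
                }
                onLine.accept(cells);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return errors;
    }

    public static void main(String[] args) {
        // Any Reader works as input; a StringReader stands in for a file here.
        Reader input = new StringReader("Anna;42\nbroken line\nBob;17\n");
        List<LineError> errors =
                parse(input, 2, cells -> System.out.println("parsed " + cells));
        for (LineError error : errors) {
            System.out.println("line " + error.lineNumber + ": " + error.message);
        }
    }
}
```

Switching to fail-fast behaviour in this sketch would only mean throwing an exception where the error is currently added to the list.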
Java Schema Parser
The javadoc within the package contains more comprehensive documentation regarding the classes mentioned below. The JSaPar package is a Java library that provides a parser for flat and CSV (Comma Separated Values) files. The concept is that a schema class denotes the way a file should be parsed or written. The schema class can be built by specifying an XML document or it can be constructed programmatically in Java code. The output of the parser is usually an org.jsapar.Document object that contains a list of org.jsapar.Line objects, each of which contains a list of org.jsapar.Cell objects.
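To make the idea of a schema-driven parser concrete, here is a minimal sketch in plain Java of how a schema could turn the raw text of one line into named, typed cells. It deliberately avoids the real org.jsapar classes, whose exact signatures are documented in the javadoc: SchemaSketch, CellDef, toLine and toDocument are hypothetical names invented for this illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// Hypothetical sketch; these are not the org.jsapar classes.
final class SchemaSketch {

    /** Describes one cell: its name and how to convert its text. */
    static final class CellDef {
        final String name;
        final Function<String, Object> converter;

        CellDef(String name, Function<String, Object> converter) {
            this.name = name;
            this.converter = converter;
        }
    }

    /** Applies the cell definitions, in order, to the raw cells of one line. */
    static Map<String, Object> toLine(List<CellDef> schema, List<String> rawCells) {
        if (schema.size() != rawCells.size()) {
            throw new IllegalArgumentException("schema defines " + schema.size()
                    + " cells but the line has " + rawCells.size());
        }
        Map<String, Object> line = new LinkedHashMap<>();
        for (int i = 0; i < schema.size(); i++) {
            CellDef def = schema.get(i);
            line.put(def.name, def.converter.apply(rawCells.get(i).trim()));
        }
        return line;
    }

    /** A "document" in this sketch is simply the ordered list of its lines. */
    static List<Map<String, Object>> toDocument(List<CellDef> schema,
                                                List<List<String>> rawLines) {
        List<Map<String, Object>> document = new ArrayList<>();
        for (List<String> rawCells : rawLines) {
            document.add(toLine(schema, rawCells));
        }
        return document;
    }

    public static void main(String[] args) {
        List<CellDef> schema = Arrays.asList(
                new CellDef("name", text -> text),
                new CellDef("age", text -> Integer.valueOf(text)));
        List<List<String>> rawLines = Arrays.asList(
                Arrays.asList("Anna", "42"),
                Arrays.asList("Bob", "17"));
        System.out.println(toDocument(schema, rawLines));
    }
}
```

The point of the sketch is the separation of concerns: the schema alone decides names and data types, so the same parsing loop can serve any file layout.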
Supported file formats:
- Fixed width - Also referred to as flat file. Each cell is described only by its position within the line. The type of the line is denoted by its position within the file.
- Fixed width control value - The same as Fixed width above, except that each line type is denoted by a control value in the leading characters of each line.
- CSV - (Comma Separated Values) Each cell is delimited by a separator character (or characters). The type of the line is denoted by its position within the file.
- CSV control value - The same as CSV above, except that each line type is denoted by a control value in the leading cell of each line.
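The difference between these formats comes down to how a line is cut into cells and how its type is recognised. The following plain-Java sketch shows one possible way to do each step. It is only an illustration under stated assumptions: CellSplitSketch and its methods are hypothetical names, not JSaPar classes, and the delimited split ignores quoting, which a full CSV parser would have to handle.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical sketch; these helpers are not part of the JSaPar API.
final class CellSplitSketch {

    /** Fixed width: each cell is identified only by its position in the line. */
    static List<String> splitFixedWidth(String line, int... widths) {
        List<String> cells = new ArrayList<>();
        int start = 0;
        for (int width : widths) {
            int end = Math.min(start + width, line.length());
            // A line shorter than the schema yields empty trailing cells.
            cells.add(start < line.length() ? line.substring(start, end).trim() : "");
            start += width;
        }
        return cells;
    }

    /** Delimited: cells are separated by one or more separator characters. */
    static List<String> splitDelimited(String line, String separator) {
        // Pattern.quote treats the separator literally; -1 keeps empty cells.
        return new ArrayList<>(Arrays.asList(line.split(Pattern.quote(separator), -1)));
    }

    /** Control value for fixed width: the leading characters name the line type. */
    static String controlValue(String line, int length) {
        return line.substring(0, Math.min(length, line.length()));
    }

    public static void main(String[] args) {
        System.out.println(splitFixedWidth("Anna 042", 5, 3));
        System.out.println(splitDelimited("Anna;;42", ";"));
        System.out.println(controlValue("H20240101", 1));
    }
}
```

For the CSV control value variant, the same idea applies after splitting: the first element of the list returned by splitDelimited would play the role of the control value.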