PDA

View Full Version : Csv Reader



marmstrong
23 Oct 2007, 8:40 AM
It seems like I always need more and more info passed to the client, and network latency can be a problem. So I've decided that structuring data as comma-delimited values is a good way to go. I may also need more than one 'table' in a single request. My server response consists of an XML file with a couple of nodes that contain CSV data.

I have seen threads suggesting that this is a bad idea, but I get an overall performance gain and haven't run into any problems. Here is the reader that I use.


Ext.extend(Ext.ux.CsvReader, Ext.data.DataReader, {
    read : function(response){
        var doc = response.responseXML;
        if(!doc) {
            throw {message: "CsvReader.read: XML Document not available"};
        }
        return this.readRecords(doc);
    },

    readRecords : function(doc){
        this.xmlData = doc;
        var root = doc.documentElement || doc;
        var q = Ext.DomQuery;
        var recordType = this.recordType, fields = recordType.prototype.fields;
        var sid = this.meta.id;
        var totalRecords = 0, success = true;
        if(this.meta.totalRecords){
            totalRecords = q.selectNumber(this.meta.totalRecords, root, 0);
        }
        if(this.meta.success){
            var sv = q.selectValue(this.meta.success, root, true);
            success = sv !== false && sv !== 'false';
        }
        var records = [];
        var ns = q.select(this.meta.record, root);
        for(var i = 0, len = ns.length; i < len; i++) {
            var n = ns[i];
            var id = sid ? q.selectValue(sid, n) : undefined;
            // strip the newline wrapping the CSV payload at each end of the node text
            var str = n.textContent.substring(1, n.textContent.length);
            var recs = str.substring(0, str.length - 2).split("\n");
            for(var x = 0, xcnt = recs.length; x < xcnt; x++) {
                // naive split: quoted fields containing commas are not handled
                var r = recs[x].split(",");
                var values = {}; // fresh object per row, so records don't share state
                for(var j = 0, jlen = fields.length; j < jlen; j++){
                    var f = fields.items[j];
                    // each Field carries a custom "indx" property giving its CSV column position
                    values[f.name] = f.convert(r[f.indx]);
                }
                var record = new recordType(values, id);
                record.node = n;
                // push rather than records[x]: with more than one CSV node,
                // x restarts at 0 and would overwrite earlier records
                records.push(record);
            }
        }
        return {
            success : success,
            records : records,
            totalRecords : totalRecords || records.length
        };
    }
});

Phunky
23 Oct 2007, 2:29 PM
What real advantage do you get from using CSV inside an XML file? Why not have the entire output as an XML file?

I don't really get the reason behind it.

Troy Wolf
25 Oct 2007, 12:33 PM
What real advantage do you get from using CSV inside an XML file? Why not have the entire output as an XML file?

I don't really get the reason behind it.

marmstrong, I'm with you!

Phunky (and others), the reason is simple: efficiency, specifically bandwidth. For example, suppose you want to pass a bit of data.

Your XML might look like this:

<?xml version="1.0" encoding="windows-1250"?>
<media>
<movies>
<movie>
<title>Monty Python and the Holy Grail</title>
<release-date>2001</release-date>
<price>10.99</price>
</movie>
<movie>
<title>Monty Python's the Meaning of Life</title>
<release-date>2005</release-date>
<price>6.99</price>
</movie>
<movie>
<title>Monty Python's Life of Brian Collector's Edition</title>
<release-date>2007</release-date>
<price>18.69</price>
</movie>
</movies>
</media>


The same data as CSV might look like this:


"Monty Python and the Holy Grail",2001,10.99
"Monty Python's the Meaning of Life",2005,6.99
"Monty Python's Life of Brian Collector's Edition",2007,18.69


It's a simple matter of bytes. The XML is more bytes than the CSV. Many times, in fact, the XML tags account for over 50% of the bytes sent. It depends on how verbose the XML tags are and the nature of the data of course.

When communicating 3 movies, who cares. When communicating 10,000 rows of data on a regular basis, you suddenly care about network efficiency as well as client consumption time.

Yes, XML is self-describing. That is, any person (and supposedly system) can consume the XML without really knowing anything about the data beforehand. That is the big advantage. However, in my 15 years of application development, most of the time, I am on both ends of the communication. That is, I'm the one developing the web service (data provider) as well as the client (data consumer). If I want to package the data up in CSV, XML, JSON, binary, or a shoebox, I can do that--it's my call. Even when I've needed to share data with outside companies, their developer and I get on the phone and work out a compatible data format--usually a simple CSV.

It's just a simple fact that even after the last two decades and all the XML hype, it's still much easier for developers and command line tools to process CSV.

XML has its place, and is very handy when used correctly. But for large, controlled data-sets, CSV still has a stronghold.

Phunky
25 Oct 2007, 12:51 PM
Now I get the reason :)

mscdex
25 Oct 2007, 7:54 PM
Does this reader take into account special circumstances in a CSV (e.g. a field wrapped in quotes that contains a comma, and fields wrapped in quotes that contain line breaks)?
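For reference, the reader posted above uses a plain split(","), so it would mishandle both cases. A minimal quote-aware parser can be sketched in plain JavaScript (this is illustrative, not part of the posted reader; the function name is made up):

```javascript
// Minimal quote-aware CSV parser sketch (illustrative, not the reader above).
// Handles commas and line breaks inside double-quoted fields, plus "" escapes.
function parseCsv(text) {
    var rows = [], row = [], field = '', inQuotes = false;
    for (var i = 0; i < text.length; i++) {
        var c = text.charAt(i);
        if (inQuotes) {
            if (c === '"') {
                if (text.charAt(i + 1) === '"') { field += '"'; i++; } // escaped quote
                else { inQuotes = false; }                             // closing quote
            } else {
                field += c; // commas and newlines are literal inside quotes
            }
        } else if (c === '"') {
            inQuotes = true;
        } else if (c === ',') {
            row.push(field); field = '';
        } else if (c === '\n') {
            row.push(field); rows.push(row); row = []; field = '';
        } else if (c !== '\r') { // tolerate DOS line endings
            field += c;
        }
    }
    if (field !== '' || row.length) { row.push(field); rows.push(row); }
    return rows;
}
```

The price of that robustness is a character-by-character scan instead of a one-line split, which is exactly the trade-off the simple reader avoids.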

Rich Kucera
11 Mar 2008, 1:03 PM
This is nice... I was just looking for a CSV reader because there's already this CSV rubbish lister on the backend:

class atkExcelCSVRecordList extends atkCustomRecordList ...

Now with ATK doing all the plumbing on the back end and Ext JS up front, heh :-)

All I do is write small action blurbs in the node (sort of like a controller).

Animal
12 Mar 2008, 1:35 AM
marmstrong, I'm with you!


The same data as CSV might look like this:


"Monty Python and the Holy Grail",2001,10.99
"Monty Python's the Meaning of Life",2005,6.99
"Monty Python's Life of Brian Collector's Edition",2007,18.69




But the same data formatted for a standard Ext.data.ArrayReader would be



[
["Monty Python and the Holy Grail",2001,10.99],
["Monty Python's the Meaning of Life",2005,6.99],
["Monty Python's Life of Brian Collector's Edition",2007,18.69]
]


So not a huge gain...
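Animal's point can be quantified with a quick sketch (plain JavaScript, not Ext; the key names are illustrative):

```javascript
// Compare wire sizes of the same three rows as CSV, as a JSON array-of-arrays
// (what an ArrayReader consumes), and as JSON array-of-objects (keys repeated per row).
var rows = [
    ["Monty Python and the Holy Grail", 2001, 10.99],
    ["Monty Python's the Meaning of Life", 2005, 6.99],
    ["Monty Python's Life of Brian Collector's Edition", 2007, 18.69]
];
var keys = ["title", "release-date", "price"]; // illustrative field names

var csv = rows.map(function (r) {
    return '"' + r[0] + '",' + r[1] + ',' + r[2];
}).join("\n");

var asArrays = JSON.stringify(rows);

var asObjects = JSON.stringify(rows.map(function (r) {
    var o = {};
    for (var i = 0; i < keys.length; i++) { o[keys[i]] = r[i]; }
    return o;
}));

// Array-of-arrays costs only a few brackets more than CSV; repeating the
// keys per row (JSON objects, or XML tags) is where the real overhead lives.
console.log(csv.length, asArrays.length, asObjects.length);
```

So the array form is within a handful of bytes of the CSV, while the per-row labels of objects or XML grow with the column count.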

Rich Kucera
12 Mar 2008, 6:18 AM
This will grab a CSV from my rubbish lister without modification:


http://service.oriented.architcture.ha.ha/myapp/dispatch.php?atknodetype=Status.prty_invs_survey_vw&atkaction=export&atklevel=1&atkprevlevel=0&atkstackid=12f22ba48fc04&phase=process&export_name=on&delimiter=%2C&enclosure=%22

Also, if someone wants a CSV right away I just hand them a URL.

Someday though I will copy and modify the custom rubbish lister for ArrayReader, something like:



class atkJSArrayRecordList extends atkCustomRecordList
{
    var $lfreplace = null;

    /**
     * Creates a special RecordList that can be used for exporting to files
     * or to make it printable.
     * @param atkNode $node The node to use as definition for the columns.
     * @param array $recordset The records to render
     * @param $compression Compression technique (bzip / gzip)
     * @param array $suppressList List of attributes from $node that should be ignored
     * @param array $outputparams Key-value parameters for output. Currently existing:
     *                            filename - the name of the file (without extension .csv)
     * @param Boolean $titlerow Should the title row be rendered or not
     * @param Boolean $decode Should data be decoded or not (for exports)
     */
    function render(&$node, $recordset, $compression="", $suppressList="", $outputparams=array(), $titlerow=true, $decode=false)
    {
        $capture_somehow = parent::render($node, $recordset, "", "\"", "\"", "][", "1", $compression, $suppressList, $outputparams, "csv", $titlerow, $decode, ";", $this->lfreplace);

        lop_off_extra_bracket_somehow($capture_somehow);

        return "[ [".$capture_somehow."] ]";
    }
    ...



Too hard. What I should do is google for a CSV2ARRAY function...

Thanks.
-Rich

p.s.

BTW the backend beast to which I keep referring is here: http://atk.achievo.org/atk

I've looked at a lot of ORMs and this is better. Even if you, say, remove its ability to generate widgets and replace them with Ext JS widgets and layout, it still packs more declarative naked objects empowered to do stuff than a regular ORM.

I suppose I could also use the existing export action to get a single record for a detail presentation... hm.

Rich Kucera
12 Mar 2008, 8:11 AM
Does this reader take into account special circumstances in a CSV (e.g. a field wrapped in quotes that contains a comma, and fields wrapped in quotes that contain line breaks)?

I would switch to XML in that case. Prior to that complication, I have simple data that can be transported in CSV.

However, I observe that my rubbish lister currently has some XML transport options...


http://service.oriented.architcture.ha.ha/myapp/dispatch.php?atknodetype=Status.prty_invs_survey_vw&atkaction=xml&atklevel=1&atkprevlevel=0&atkstackid=12f22ba48fc04


It puts everything in attributes, though, not elements (e.g. the title and release date are placed in attributes).

Maybe Ext JS can deal with that, I don't know...

devnull
12 Mar 2008, 8:55 AM
I really don't mean to derail the discussion, but I am surprised nobody has talked about JSON. If you feel that the JSON vs XML argument has already happened enough, then ignore my rambling :)
In general, XML is expensive (both computationally and bandwidth-wise) to process. JSON more closely matches the end format (a JavaScript object), and can very closely match the source format (especially if Java), so processing time is minimal. It also doesn't hurt that it's fairly compact.

Rich Kucera
12 Mar 2008, 11:32 AM
JSON more closely matches the end format (a JavaScript object), and can very closely match the source format (especially if Java), so processing time is minimal. It also doesn't hurt that it's fairly compact.

Fine then. I have added a json handler to the rubbish lister


http://service.oriented.architcture.ha.ha/myapp/dispatch.php?atknodetype=Status.prty_invs_survey_vw&atkaction=json&atklevel=1&atkprevlevel=0&atkstackid=12f22ba48fc04


I must now go to the Ext JS side and see what I can do (will play with the desktop example, my target is to replace a widget happy Oracle Forms application).

dantheman
12 Mar 2008, 2:30 PM
I really don't mean to derail the discussion, but I am surprised nobody has talked about JSON. If you feel that the JSON vs XML argument has already happened enough, then ignore my rambling :)
In general, XML is expensive (both computationally and bandwidth-wise) to process. JSON more closely matches the end format (a JavaScript object), and can very closely match the source format (especially if Java), so processing time is minimal. It also doesn't hurt that it's fairly compact.

I've never been an XML fan, much preferring JSON for nearly all applications.

But JSON is clearly more verbose than CSV (or the like). The more columns, the worse the problem. I recently had to enable client-side mining of a large data set (large for the web)... The data in JSON: 25 MB; in TSV: 1.9 MB.

--dan

sanjshah
12 Mar 2008, 5:47 PM
Hi,

Would it be possible to show an example of usage?

Thanks!

Troy Wolf
20 Jun 2008, 7:29 AM
I really don't mean to derail the discussion, but I am surprised nobody has talked about JSON. If you feel that the JSON vs XML argument has already happened enough, then ignore my rambling :)
In general, XML is expensive (both computationally and bandwidth-wise) to process. JSON more closely matches the end format (a JavaScript object), and can very closely match the source format (especially if Java), so processing time is minimal. It also doesn't hurt that it's fairly compact.

Agreed -- JSON is more readily machine built and consumed, and, in my opinion, relatively human-readable. The real point of this discussion is that XML and JSON are expensive in that the field labels (or keys) are supplied for every instance of the data. With CSV, you only supply the data and maybe a single header row to describe the keys (field names).

Animal makes a very good point--as is his tendency! ArrayReader can consume a multi-array which is almost no more overhead on the wire than CSV. I guess I was narrowly thinking that the ArrayReader was for reading arrays you build client-side. I had not considered that I could pass an array as a string from the server via an Ajax call to be consumed via an ArrayReader. I will have to test this.
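Conceptually the round trip is simple: the server emits the array literal as text, and the client decodes it and maps columns to field names, which is essentially what the ArrayReader does internally. A plain-JavaScript sketch of the decode step (the field names here are made up for illustration):

```javascript
// What an ArrayReader-style decode amounts to: the server sends the array
// literal as text; the client parses it and maps columns to field names.
// (Hypothetical field list; Ext's ArrayReader performs this mapping for you.)
var responseText = '[["Monty Python and the Holy Grail",2001,10.99],' +
                   '["Monty Python\'s the Meaning of Life",2005,6.99]]';

var fields = ["title", "releaseDate", "price"]; // illustrative field names

var rows = JSON.parse(responseText); // the Ext 2.x era used Ext.util.JSON.decode

var records = rows.map(function (r) {
    var rec = {};
    for (var i = 0; i < fields.length; i++) { rec[fields[i]] = r[i]; }
    return rec;
});
```

The wire payload stays almost as lean as CSV, while the parsing is delegated to the JSON decoder instead of a hand-rolled splitter.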

sunilpca
5 Jan 2009, 9:17 PM
It seems like I always need more and more info passed to the client, and network latency can be a problem. So I've decided that structuring data as comma-delimited values is a good way to go. I may also need more than one 'table' in a single request. My server response consists of an XML file with a couple of nodes that contain CSV data.

I have seen threads suggesting that this is a bad idea, but I get an overall performance gain and haven't run into any problems. Here is the reader that I use.


Ext.extend(Ext.ux.CsvReader, Ext.data.DataReader, {
    read : function(response){
        var doc = response.responseXML;
        if(!doc) {
            throw {message: "CsvReader.read: XML Document not available"};
        }
        return this.readRecords(doc);
    },

    readRecords : function(doc){
        this.xmlData = doc;
        var root = doc.documentElement || doc;
        var q = Ext.DomQuery;
        var recordType = this.recordType, fields = recordType.prototype.fields;
        var sid = this.meta.id;
        var totalRecords = 0, success = true;
        if(this.meta.totalRecords){
            totalRecords = q.selectNumber(this.meta.totalRecords, root, 0);
        }
        if(this.meta.success){
            var sv = q.selectValue(this.meta.success, root, true);
            success = sv !== false && sv !== 'false';
        }
        var records = [];
        var ns = q.select(this.meta.record, root);
        for(var i = 0, len = ns.length; i < len; i++) {
            var n = ns[i];
            var id = sid ? q.selectValue(sid, n) : undefined;
            // strip the newline wrapping the CSV payload at each end of the node text
            var str = n.textContent.substring(1, n.textContent.length);
            var recs = str.substring(0, str.length - 2).split("\n");
            for(var x = 0, xcnt = recs.length; x < xcnt; x++) {
                // naive split: quoted fields containing commas are not handled
                var r = recs[x].split(",");
                var values = {}; // fresh object per row, so records don't share state
                for(var j = 0, jlen = fields.length; j < jlen; j++){
                    var f = fields.items[j];
                    // each Field carries a custom "indx" property giving its CSV column position
                    values[f.name] = f.convert(r[f.indx]);
                }
                var record = new recordType(values, id);
                record.node = n;
                // push rather than records[x]: with more than one CSV node,
                // x restarts at 0 and would overwrite earlier records
                records.push(record);
            }
        }
        return {
            success : success,
            records : records,
            totalRecords : totalRecords || records.length
        };
    }
});

Please help me with using CsvReader. When I use the above code, it throws the error "sp undefined". I think this is because I don't have CsvReader.js, so please give me the URL where I can download the relevant CsvReader js files. If possible, can you provide the complete source code for reading a CSV file and displaying it in a grid? Thanks

Animal
6 Jan 2009, 1:37 AM
That's because at the time that code executes, Ext.data.DataReader is not defined.

"sp" is the superclass that you are extending.

sunilpca
6 Jan 2009, 6:31 AM
That's because at the time that code executes, Ext.data.DataReader is not defined.

"sp" is the superclass that you are extending.

Thanks for your quick reply, but I am still unable to get the data into the grid (CSV file upload and display the data in a grid). Please help me.

Please see below codes for all js files.

CSVReader.js:
==========


Ext.ux.CsvReader = function(record_type, config) {
    var meta = {};
    Ext.apply(this, config);
    if (record_type == null) {
        // force this.UseHeader to true
        this.UseHeader = true;
    }
    Ext.ux.CsvReader.superclass.constructor.call(this, meta, record_type);
};
Ext.extend(Ext.ux.CsvReader, Ext.data.DataReader, {
    read : function(response){
        var doc = response.responseXML;
        if(!doc) {
            throw {message: "CsvReader.read: XML Document not available"};
        }
        return this.readRecords(doc);
    },

    readRecords : function(doc){
        this.xmlData = doc;
        var root = doc.documentElement || doc;
        var q = Ext.DomQuery;
        var recordType = this.recordType, fields = recordType.prototype.fields;
        var sid = this.meta.id;
        var totalRecords = 0, success = true;
        if(this.meta.totalRecords){
            totalRecords = q.selectNumber(this.meta.totalRecords, root, 0);
        }
        if(this.meta.success){
            var sv = q.selectValue(this.meta.success, root, true);
            success = sv !== false && sv !== 'false';
        }
        var records = [];
        var ns = q.select(this.meta.record, root);
        for(var i = 0, len = ns.length; i < len; i++) {
            var n = ns[i];
            var id = sid ? q.selectValue(sid, n) : undefined;
            // strip the newline wrapping the CSV payload at each end of the node text
            var str = n.textContent.substring(1, n.textContent.length);
            var recs = str.substring(0, str.length - 2).split("\n");
            for(var x = 0, xcnt = recs.length; x < xcnt; x++) {
                // naive split: quoted fields containing commas are not handled
                var r = recs[x].split(",");
                var values = {}; // fresh object per row, so records don't share state
                for(var j = 0, jlen = fields.length; j < jlen; j++){
                    var f = fields.items[j];
                    // each Field carries a custom "indx" property giving its CSV column position
                    values[f.name] = f.convert(r[f.indx]);
                }
                var record = new recordType(values, id);
                record.node = n;
                // push rather than records[x]: with more than one CSV node,
                // x restarts at 0 and would overwrite earlier records
                records.push(record);
            }
        }
        return {
            success : success,
            records : records,
            totalRecords : totalRecords || records.length
        };
    }
});


file-upload.js:
==========

/*
* Ext JS Library 2.2
* Copyright(c) 2006-2008, Ext JS, LLC.
* licensing@extjs.com
*
* http://extjs.com/license
*/


Ext.onReady(function(){
    Ext.QuickTips.init();

    var msg = function(title, msg){
        Ext.Msg.show({
            title: title,
            msg: msg,
            minWidth: 200,
            modal: true,
            icon: Ext.Msg.INFO,
            buttons: Ext.Msg.OK
        });
    };

    var fp = new Ext.FormPanel({
        renderTo: 'fi-form',
        fileUpload: true,
        width: 500,
        frame: true,
        title: 'File Upload Form',
        autoHeight: true,
        bodyStyle: 'padding: 10px 10px 0 10px;',
        labelWidth: 50,
        defaults: {
            anchor: '95%',
            allowBlank: false,
            msgTarget: 'side'
        },
        items: [{
            xtype: 'fileuploadfield',
            id: 'form-file',
            emptyText: 'Select a CSV file',
            fieldLabel: 'File',
            // the field name posted to the server;
            // 'multipart/form-data' is the form enctype, not a field name
            name: 'fileupload',
            buttonCfg: {
                text: '',
                iconCls: 'upload-icon'
            }
        }],
        buttons: [{
            text: 'Save',
            handler: function(){
                if(fp.getForm().isValid()){
                    fp.getForm().submit({
                        url: 'EmpDetails.jsp',
                        waitMsg: 'Uploading csv file...',
                        success: function(fp, action){
                            Ext.Msg.alert('Success', 'Processed file "' + action.result.errors.file + '" on the server', function(btn, text){
                                if (btn == 'ok'){
                                    // Added for CSV Reader
                                    var proxy = new Ext.data.HttpProxy({url: action.result.errors.file});
                                    var CSVEscapeChar = '\\';   // character used to quote other special characters
                                    var CSVFieldDelim = ',';    // the field delimiter
                                    var CSVTextQualifier = '"'; // string delimiter (field and row delimiters ignored inside strings)
                                    var CSVRowDelim = '\n';     // the record delimiter
                                    var DataHasHeader = true;   // is there a header line?
                                    var ParserDOSKludge = true; // treat \r\n as \n
                                    var recordMetaType = Ext.data.Record.create([
                                        {name : 'EMP_ID',   type: 'int'},
                                        {name : 'EMP_NAME', type: 'string'}, // was 'int'; a name field should be a string
                                        {name : 'EMP_AGE',  type: 'int'}
                                    ]);
                                    var reader = new Ext.ux.CsvReader(recordMetaType, {
                                        EscapeChar    : CSVEscapeChar,
                                        FieldDelim    : CSVFieldDelim,
                                        TextQualifier : CSVTextQualifier,
                                        RowDelim      : CSVRowDelim,
                                        UseHeader     : DataHasHeader
                                        //DOSKludge   : ParserDOSKludge
                                    });
                                    var primaryDataStore = new Ext.data.Store({
                                        proxy: proxy,
                                        reader: reader,
                                        listeners: { // was "listener:", which Ext silently ignores
                                            load: function(){
                                                alert("loading....." + proxy);
                                            }
                                        }
                                    });
                                    // NOTE: this callback is defined but never wired up (e.g. via
                                    // primaryDataStore.on('load', ...)), and it references several
                                    // undefined names: columnModelMetaData, Ag.data.RecordReader,
                                    // Ext.data.PagingMemoryProxy and TablePageSize. It also uses the
                                    // Ext 1.x grid/toolbar API rather than the Ext 2.x one.
                                    function primaryDataStoreLoadedCallback(store, records, options) {
                                        // create the Data Store
                                        var dataStore = new Ext.data.Store({
                                            proxy: new Ext.data.PagingMemoryProxy(records),
                                            reader: new Ag.data.RecordReader({}, store.reader.recordType),
                                            remoteSort: true
                                        });
                                        if (columnModelMetaData == null) {
                                            columnModelMetaData = store.reader.columnModelData;
                                        }
                                        var columnModel = new Ext.grid.ColumnModel(columnModelMetaData);
                                        columnModel.defaultSortable = true;
                                        // create the grid
                                        var grid = new Ext.grid.GridPanel('example-grid', {
                                            ds: dataStore,
                                            cm: columnModel,
                                            selModel: new Ext.grid.RowSelectionModel({singleSelect: true}),
                                            enableColLock: false,
                                            loadMask: true
                                        });
                                        // make the grid resizable, do before render for better performance
                                        var rz = new Ext.Resizable('example-grid', {
                                            wrap: true,
                                            minHeight: 240,
                                            pinned: true,
                                            handles: 's se e'
                                        });
                                        rz.on('resize', grid.autoSize, grid);
                                        grid.render();
                                        var gridFoot = grid.getView().getFooterPanel(true);
                                        // add a paging toolbar to the grid's footer
                                        var paging = new Ext.PagingToolbar(gridFoot, dataStore, {
                                            pageSize: TablePageSize,
                                            displayInfo: true,
                                            displayMsg: 'Displaying lines {0} - {1} of {2}',
                                            emptyMsg: "No lines to display"
                                        });
                                        // trigger the data store load
                                        dataStore.load({params: {start: 0, limit: TablePageSize}});
                                    }
                                    //End
                                }
                            });
                        }
                    });
                }
            }
        },{
            text: 'Reset',
            handler: function(){
                fp.getForm().reset();
            }
        }]
    });
});


Thanks

alexbanda1982
9 Nov 2010, 12:07 PM
Did you get this working...?? Help!