PxPlus User Forum

Twitter Twitter Twitter

Author Topic: READ DATA ... FROM ... with quoted fields  (Read 1805 times)

James Zukowski

  • Diamond Member
  • *****
  • Posts: 297
    • View Profile
READ DATA ... FROM ... with quoted fields
« on: April 13, 2020, 11:44:31 AM »
I'd like to have the ability to more easily parse a string into its component fields when any of the fields may be quoted. For example, if A$ contains:

1001,"Smith's, Inc.","123 First Street",Anytown,XX,"13579-2468","John Smith, Sr."

being able to parse that into what humans can interpret as 7 fields would be helpful. Perhaps adding another fileopt would do the trick; e.g.,

Code: [Select]
READ DATA FROM A$,key=# to iol=My_Data
where the key value is cumulative:
1 = Detect quotation marks (")
2 = Detect apostrophes (')
4 = Retain field quotes

If key=0 or not specified, it would operate as it does now.

E.g., with the data above, key=1 would return:
1001
Smith's, Inc.
123 First Street
Anytown
XX
13579-2468
John Smith, Sr.

whereas key=5 would return:
1001
"Smith's, Inc."
"123 First Street"
Anytown
XX
"13579-2468"
"John Smith, Sr."

key=3 or key=7 would allow for either a quote or apostrophe to identify the quoted field, and then look for the matching end-of-field character.

For key=1 or key=2 or key=3, any in-field double-characters would be reduced to single occurrences. Including the retain option (4) would also retain the doubled characters.
James Zukowski
Sr. Developer - J&E

BRAND>SAFWAY
Brand Industrial Services

Mike King

  • Diamond Member
  • *****
  • Posts: 3810
  • Mike King
    • View Profile
    • BBSysco Consulting
Re: READ DATA ... FROM ... with quoted fields
« Reply #1 on: April 13, 2020, 12:36:10 PM »
Why not just use the STR(",") format as in:

  x$="1001,'Smith's, Inc.','123 First Street',Anytown,XX,'13579-2468','John Smith, Sr.'"
  x$=sub(x$,"'",quo)
  print x$

 !
 ! Parse
 !
  dim fields$[1:7]
  read data from x$ to fields${all}:[str(",")]
  dump

NOTE: The lines in blue simply were used to create your test string with quotes.

This will parse the fields for you.  About the only challenge is knowing the number of fields, but if you just use the POS function to count the number of commas you will get a worst case to define the number of elements.  Mind you generally you know how many fields to expect when you design the logic.

Here is the output:

DIM fields$[1:7]
fields$[1]="1001"
fields$[2]="Smiths, Inc."
fields$[3]="123 First Street"
fields$[4]="Anytown"
fields$[5]="XX"
fields$[6]="13579-2468"
fields$[7]="John Smith, Sr."

Mike King
President - BBSysco Consulting
eMail: mike.king@bbsysco.com

James Zukowski

  • Diamond Member
  • *****
  • Posts: 297
    • View Profile
Re: READ DATA ... FROM ... with quoted fields
« Reply #2 on: April 13, 2020, 02:11:49 PM »
Good idea. (Blue lines, though removed the apostrophe from "Smith's, Inc.".)
What would really rock would be dynamic arrays building out from that:

dim Fields$[<star>] ! Because the * doesn't work in the forum
read data from Source$ to Fields${all}:[str(",")]


Then we could:
Fld_Count=dim(read max(Fields$))
to get the actual number of fields present. While we're using data examples here, it could be used for parsing text lines or command string parameters.

Isn't that what wishes are for?
James Zukowski
Sr. Developer - J&E

BRAND>SAFWAY
Brand Industrial Services

Mike King

  • Diamond Member
  • *****
  • Posts: 3810
  • Mike King
    • View Profile
    • BBSysco Consulting
Re: READ DATA ... FROM ... with quoted fields
« Reply #3 on: April 13, 2020, 04:10:47 PM »
Actually I generally just allocate a large array and don't worry much about getting the field count correct since normally the logic knows which column its interested in.

Basically I just do a DIM of 100 or so.

Dynamically allocating the number of fields could have the detrimental effect that if there was a NULL or short record in the input, the array might not be made large enough resulting in a sub-scripting error later in the code.  Forcing allocation of the array assures that will not happen.
Mike King
President - BBSysco Consulting
eMail: mike.king@bbsysco.com

James Zukowski

  • Diamond Member
  • *****
  • Posts: 297
    • View Profile
Re: READ DATA ... FROM ... with quoted fields
« Reply #4 on: April 13, 2020, 05:27:54 PM »
Though with a dynamic array, only the exact number of entries would be created, mirroring:

dim Fld$[ * ]
Fld$[ * ]="1001"
Fld$[ * ]="Smith's, Inc."
Fld$[ * ]="123 First Street"
Fld$[ * ]="Anytown"
Fld$[ * ]="XX"
Fld$[ * ]="13579-2468"
Fld$[ * ]="John Smith, Sr."

And yes, null fields would be valid. In most cases, I will either pre-allocate a large array based on something like the pos(","=X$,1,0), or I'll dim Fld$[ * ] and manually parse the entries.
Just trying to find better, faster alternatives. Also, accounting for the possibility of quoting fields with apostrophes, which helps when passing arguments to Linux with quotation marks.
Thanks for the other tips.  :)
James Zukowski
Sr. Developer - J&E

BRAND>SAFWAY
Brand Industrial Services

Mike King

  • Diamond Member
  • *****
  • Posts: 3810
  • Mike King
    • View Profile
    • BBSysco Consulting
Re: READ DATA ... FROM ... with quoted fields
« Reply #5 on: April 13, 2020, 05:55:45 PM »
Actually the STR(",") would parse the fields with apostrophes just fine.  It was my example where I tried to make it simpler to follow by replacing all apostrophes that was the issue.

Again, you are reading this data from a file and there happens to be an empty record, if you dynamically define the number of elements in the array you will get zero and/or your subsequent code may reference subscripts that weren't defined.  This is why I pre-define with fixed array dimensions.
Mike King
President - BBSysco Consulting
eMail: mike.king@bbsysco.com