PxPlus User Forum

Twitter Twitter Twitter

Author Topic: performance of select record  (Read 1586 times)

Thomas Bock

  • Diamond Member
  • *****
  • Posts: 179
    • View Profile
performance of select record
« on: August 24, 2023, 10:20:36 AM »
So far we don't use "select record" a lot. I did some tests and found that it is slower than an old style read loop. I'm wondering why. As it knows all the keys I'm interested in including all conditions it should be much faster than any programmed logic.
I tested with a file constisting of 600,000 records and read all records from then second to the last but one using the primary key. All runs with "select record" needed 3 seconds longer than the old style read loop.

In my opinion it should be much faster.

Jim Morton

  • Member
  • **
  • Posts: 12
    • View Profile
Re: performance of select record
« Reply #1 on: August 24, 2023, 11:16:03 AM »
In your read loop were you using "read record" or an iolist?
Just making sure you are comparing apples to apples.
I have noticed that read/select record into a string template is a bit slower than read/select using an iolist. 

PxPlus

  • Administrator
  • Diamond Member
  • *****
  • Posts: 1091
    • View Profile
Re: performance of select record
« Reply #2 on: August 24, 2023, 01:18:58 PM »
Not certain how you ran your test but I just did the following:

-}list
0010 OPEN (1,IOL=*)"bigfile"
0020 LET rd=0,sel=0
0030 LET x=TMR(0)
0040 READ (1,END=0070)
0050 rd++
0060 GOTO 0040
0070 PRINT TMR(0),rd
0080 CLOSE (1)
0090 LET x=TMR(0)
0100 SELECT * FROM "bigfile"
0110 sel++
0120 NEXT RECORD
0130 PRINT TMR(0),sel
-}run
 1.241 114697
 1.089 114697
-}run
 1.245 114697
 1.088 114697
-}run
 1.237 114697
 1.087 114697


As you can see the SELECT was faster even considering the SELECT timing included the file open/close..
« Last Edit: August 24, 2023, 01:32:28 PM by PxPlus »

Stéphane Devouard

  • Diamond Member
  • *****
  • Posts: 122
  • PxPlus guru with skills in PHP, JS, C#, Java
    • View Profile
    • Stéphane's Web Resume
Re: performance of select record
« Reply #3 on: August 28, 2023, 07:16:53 AM »
Thomas

As it knows all the keys I'm interested in including all conditions it should be much faster than any programmed logic.

This is an incorrect assumption. ProvideX SELECT is *not* SQL SELECT

You are right about SQL SELECT where the SQL engine chooses the best possible path to get to the data based on various criteria including the WHERE clause -- and it will be fast as long as you have indexed the columns on which you're selecting.

ProvideX SELECT is some syntactic sugar which allows you to easily code a file read loop
You can use a file channel or a file name. If you use the latter, the file will be closed after the last NEXT RECORD iteration
However :
- You must select yourself the best KNO to access the data. ProvideX won't do it for you
- If you use a file channel, the SELECT will start the loop at the current cursor position in the file, unless you use the BEGIN clause to specify the first key to read (this is the same as the Cobol START statement)
- Adding a WHERE allows ProvideX to execute the logic inside the SELECT/NEXT RECORD loop for only the records that match the conditions. Basically the IFs are done at the C-level therefore they are faster. If you don't specify a WHERE then the loop will be executed for all the record between the BEGIN (if any) and END (if any) clauses

Hope this helps
Stéphane Devouard
Portfolio | Work

PxPlus

  • Administrator
  • Diamond Member
  • *****
  • Posts: 1091
    • View Profile
Re: performance of select record
« Reply #4 on: September 05, 2023, 10:31:46 AM »
Just a side note.

As Stephane stated the select does not attempt to determine the best key to use.  It was designed for speed and therefore assumes the programmer will pre-identify which key would be optimal.

Having the system analyze a request such as a SELECT to determine the key to use is basically a waste of system resources.  The programmer should know up front which key would be best so why have the system try and figure this out every time it hits the SELECT statement?

Its fine for a generic query system SELECT to figure out which key to use, but for a program it makes more sense for the programmer to state which key the system should use to assure consistent results.

To simplify this the system supports named keys so that the file/table can have its keys altered and the programmer will not need to adapt the code to different key numbers.  For example if you want the access by name, then name the key something like "ByName" and opposed to key number 'nnn'.