My quest for efficiency

Fernando Cabral
Hi

I've found a file whose text has been obfuscated by subtracting 11 from
every byte. Now I want to bring it back to regular text. To do this I have
to add 11 to each byte read from that file. I have tried several ways
to do it, and they all seemed very inefficient to me. Two examples follow:

j = 0
For i = 0 To Len(RawText)
  str &= Chr(CByte(Asc(RawText, i) + 11))  ' either this or the following
  'Mid(Rawtext, i, 1) = Chr(CByte(Asc(RawText, i) + 11))
  Inc j
  If j = 100000 Then
    Print i; Now
    j = 0
  Endif
Next

In the first option (uncommented) I am building a new string byte by byte.
In the second option (commented) I am replacing each character in place.
I expected the second option to be way faster, especially because there is
no need for the string to be reallocated. Nevertheless, it turned out to be a
snail.
The first option, even though it gets slower and slower as
the string grows, is still way faster than the second option.

To me it does not make sense. Does it to you?
Also, is there a faster way to do this?

--
Fernando Cabral
Blogue: http://fernandocabral.org
Twitter: http://twitter.com/fjcabral
e-mail: [hidden email]
Facebook: [hidden email]
Telegram: +55 (37) 99988-8868
Wickr ID: fernandocabral
WhatsApp: +55 (37) 99988-8868
Skype:  fernandojosecabral
Telefone fixo: +55 (37) 3521-2183
Telefone celular: +55 (37) 99988-8868

Enquanto houver no mundo uma só pessoa sem casa ou sem alimentos,
nenhum político ou cientista poderá se gabar de nada.
_______________________________________________
Gambas-user mailing list
[hidden email]
https://lists.sourceforge.net/lists/listinfo/gambas-user
Re: My quest for efficiency

gambas-user mailing list
In reply to Fernando Cabral's message of 15/07/2017, 16:08:

Strings in Gambas are immutable. The second method does not act "in
place", because this syntactic sugar actually creates a new string by
concatenating the left part, the new character, and the right part.

Use a byte array instead of a string (if you are sure that you are
dealing with ASCII, of course), or a string buffer (OPEN STRING syntax;
see the wiki for the details).

Regards,

--
Benoît Minisini
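
As an aside, here is a minimal C sketch (C because the thread already uses it) of the cost difference described above. It does not show how Gambas works internally; it only imitates the effect of an immutable string by building a fresh copy for every single-byte change, and compares that with updating a byte array where it lives. The function names `shift_by_copying` and `shift_in_place` are made up for this illustration.

```c
#include <stdlib.h>
#include <string.h>

/* Immutable-string style: every single-byte change builds a whole new copy,
 * so n changes cost on the order of n * n byte copies.
 * Returns a newly allocated buffer (caller frees) or NULL if malloc fails. */
unsigned char *shift_by_copying(const unsigned char *src, size_t n,
                                unsigned char delta)
{
    unsigned char *cur = malloc(n ? n : 1);
    if (cur == NULL)
        return NULL;
    memcpy(cur, src, n);

    for (size_t i = 0; i < n; i++) {
        unsigned char *next = malloc(n);
        if (next == NULL) {
            free(cur);
            return NULL;
        }
        memcpy(next, cur, n);                       /* copy everything again */
        next[i] = (unsigned char)(cur[i] + delta);  /* change just one byte  */
        free(cur);
        cur = next;
    }
    return cur;
}

/* Byte-array style: one pass, each byte updated where it lives. */
void shift_in_place(unsigned char *buf, size_t n, unsigned char delta)
{
    for (size_t i = 0; i < n; i++)
        buf[i] = (unsigned char)(buf[i] + delta);
}
```

Both functions produce the same bytes. The first does work proportional to the square of the length, the second to the length itself, which is the kind of gap that turns seconds into hours on a file of tens of megabytes.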

Re: My quest for efficiency

Fernando Cabral
In reply to this post by Fernando Cabral
Well, after 5 hours the most efficient version is still running. Only 1/5
of the file has been processed. The less efficient version has only
processed 1 MB, or 1/42 of the file.

So I decided to write a C program to do the same task. Since I have not
used C in the last 20 years, I did not try anything fancy. I know C
has to be more efficient, so I expected it to take, perhaps, 10 minutes or
5 minutes. Not so. To my surprise, the program below did the whole thing
in ONE SECOND!

I found this to be quite unexpected.

#include <stdio.h>

int main(void)
{
    FILE *fp;
    int c;

    fp = fopen("/home/fernando/temp/deah001.dhn", "r");
    while ((c = fgetc(fp)) != EOF) {
        putchar(c + 11);
    }
    fclose(fp);
    return 0;
}

I am sure there is a way to do this efficiently in Gambas. Certainly not in
1 second, as happened here, but perhaps in 5 or 10 minutes instead of
the several hours it is now taking.

- fernando


Re: My quest for efficiency

Tony Morehen
Did you try Benoit's suggestion?

Public Sub Main()

   Dim sIn As String
   Dim sOut As String

   sIn = File.Load("/home/fernando/temp/deah001.dhn")
   sOut = Add11(sIn)
   File.Save("/home/fernando/temp/deah001.11Added.dhn", sOut)

End

Public Sub Add11(InputString As String) As String
   Dim bArray As Byte[]
   Dim i As Integer

   ' Convert the string to a byte array once, then change each byte in the array
   bArray = Byte[].FromString(InputString)
   For i = 0 To bArray.Max
     bArray[i] += 11
   Next
   Return bArray.ToString
End


(In reply to Fernando Cabral's message of 2017-07-15, 01:36 PM.)

Re: My quest for efficiency

gambas-user mailing list
In reply to Tony Morehen's message of 15/07/2017, 20:49:

Just a remark:

You don't have to use Byte[].FromString.

You can use the Byte[].Read() method instead, to load the file directly
into the array. That way you avoid an intermediate string.

Regards,

--
Benoît Minisini
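
For comparison, the same idea (file contents straight into a byte buffer, no intermediate string) might look like this in C. This is only a sketch under simple assumptions: the file is a regular file small enough to fit in memory, and `load_bytes` and `add_to_bytes` are hypothetical helper names, not part of any library.

```c
#include <stdio.h>
#include <stdlib.h>

/* Read an entire file into a newly allocated byte buffer.
 * On success returns the buffer (caller frees) and stores its size in *len;
 * returns NULL on any error. */
unsigned char *load_bytes(const char *path, size_t *len)
{
    FILE *fp = fopen(path, "rb");
    if (fp == NULL)
        return NULL;

    unsigned char *buf = NULL;
    long size = 0;
    if (fseek(fp, 0, SEEK_END) == 0 && (size = ftell(fp)) >= 0 &&
        fseek(fp, 0, SEEK_SET) == 0) {
        buf = malloc(size > 0 ? (size_t)size : 1);
        if (buf != NULL && fread(buf, 1, (size_t)size, fp) != (size_t)size) {
            free(buf);          /* short read: treat as failure */
            buf = NULL;
        }
    }
    fclose(fp);
    if (buf != NULL)
        *len = (size_t)size;
    return buf;
}

/* Add delta to every byte in place; the result wraps modulo 256. */
void add_to_bytes(unsigned char *buf, size_t len, unsigned char delta)
{
    for (size_t i = 0; i < len; i++)
        buf[i] = (unsigned char)(buf[i] + delta);
}
```

A caller would load the file, call `add_to_bytes(buf, len, 11)`, write the buffer out with `fwrite`, and free it.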


Re: My quest for efficiency

Caveat-2
Something is horribly wrong, or you're running on a 286 :-)

I just tested here, and the program runs on a 51 MB test file in about 5
seconds.

Some reasonably well commented code for you...

Public Sub Main()

   Dim inFile, outFile As File
   Dim buff As New Byte[1024]
   Dim idx, remBytes, readSize As Integer

   ' CHANGE THIS to your input file
   inFile = Open "/home/caveat/Downloads/mytestfile" For Read

   ' CHANGE THIS to your output file
   outFile = Open "/home/caveat/Downloads/mytestfile.out2" For Create

   ' Remaining bytes starts as the total length of the file
   remBytes = Lof(inFile)

   ' Until we reach the end of the input file... guess you could instead check on remBytes...
   While Not Eof(inFile)
     If remBytes > buff.length Then
       ' Limit reading to the size of our buffer (the Byte[])
       readSize = buff.length
     Else
       ' Only read the bytes we have left into our buffer (the Byte[])
       readSize = remBytes
     Endif
     ' Read from the input file into our buffer, starting at offset 0 in the buffer
     buff.Read(inFile, 0, readSize)
     ' Update the number of bytes remaining...
     remBytes = remBytes - readSize
     ' Run round each byte we actually read into our buffer
     For idx = 0 To readSize - 1
       ' Dunno if you need any conditions, I check for > 30 as I can put
       ' newlines in the file to make it more readable for testing
       If buff[idx] > 30 Then
         ' This is the 'trick' you need to apply... subtract 11 from every byte in the file
         ' Not sure how you deal with edge cases... if you have a byte of 5, is your result then 250?
         buff[idx] = buff[idx] - 11
       Endif
     Next
     ' Write the bytes we read back out to the output file
     buff.Write(outFile, 0, readSize)
   Wend
   Wend

   Close #inFile
   Close #outFile

End


Kind regards,
Caveat
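
The chunked approach above can be sketched in C as well, for anyone who wants to compare like with like. This is an illustrative sketch, not the program timed in this thread: `shift_stream` is a made-up name, the 4096-byte buffer is an arbitrary choice, and unlike the Gambas code it applies the shift to every byte with no `> 30` condition.

```c
#include <stdio.h>

/* Copy `in` to `out`, adding delta to every byte, one chunk at a time.
 * Returns 0 on success, -1 on a read or write error. */
int shift_stream(FILE *in, FILE *out, int delta)
{
    unsigned char buf[4096];
    size_t got;

    while ((got = fread(buf, 1, sizeof buf, in)) > 0) {
        /* Touch only the bytes this read returned; the last chunk is
         * usually shorter than the buffer. */
        for (size_t i = 0; i < got; i++)
            buf[i] = (unsigned char)(buf[i] + delta);
        if (fwrite(buf, 1, got, out) != got)
            return -1;
    }
    return ferror(in) ? -1 : 0;
}
```

Because `fread` reports how many bytes it delivered, there is no need to track the remaining length separately; the loop simply stops when a read returns nothing.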

(In reply to Benoît Minisini's message of 15-07-17, 21:24.)

Re: My quest for efficiency

Fernando Cabral
Thank you, Caveat [emptor?].
The code you proposed worked very, very well.
In fact, I timed it against two versions of the C program and the result
was quite good.
In C, reading from the standard input and writing to the standard output
took a trifle over half a second (0.6?? real time).

Meanwhile, your version in Gambas ran in 2.5?? (real time). The question
marks stand for digits that varied a little from trial to trial.

Out of curiosity, I wrote a Gambas version similar to the C version. Here
are the two programs:

  Dim b As Byte

  While Not Eof()
    b = Read As Byte
    Write (b + 11) As Byte
  Wend

- - - - - - -

  int c;

  while ((c = getchar()) != EOF)
    putchar(c + 11);

The C version ran in 0.59 seconds against 7.6 seconds for the Gambas
version (real time). Not too bad in my opinion!

Then I tried to stretch it a little bit and wrote:

  Write ((Read As Byte) + 11) As Byte

Alas! This is something Gambas does not understand.

These are just for the sake of experience. I am happy with the solution
Caveat proposed.
Thank you.
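
On the edge case raised earlier in the thread (what happens to a byte of 5 when 11 is subtracted), a small C sketch may help. In C, `putchar` converts its argument to `unsigned char`, so `putchar(c + 11)` wraps around modulo 256. Assuming the original obfuscation wrapped the same way (an assumption, since nobody in the thread saw the tool that produced the file), subtracting and adding 11 undo each other for every possible byte value. The names `obfuscate` and `restore` are made up for the illustration.

```c
/* The obfuscation described in the thread: subtract 11 from a byte.
 * Converting the result to unsigned char wraps modulo 256, so 5 becomes 250. */
unsigned char obfuscate(unsigned char b)
{
    return (unsigned char)(b - 11);
}

/* The reverse step: add 11 back, again wrapping modulo 256. */
unsigned char restore(unsigned char b)
{
    return (unsigned char)(b + 11);
}
```

With these definitions `obfuscate(5)` is 250 and `restore(250)` is 5. Whether Gambas Byte arithmetic wraps in the same way was not checked here.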

(In reply to Caveat's message of 2017-07-15, 17:28 GMT-03:00.)