Mystery with file.load and regexp


Fernando Cabral
Concerning RegExp, I have another mysterious thing to understand.

If I do something like:

    Searchfor.Push("Word")
    Searchfor.Push("Power")
    Searchfor.Push("The same")
    For Each searchedfor In Searchfor
      re.Compile(searchedfor, re.utf8)
    Next

the expressions get compiled. No error.
Nevertheless, if I load the same words from a file, using this expression:

    Dim Searchfor As New String[] = Split(File.Load("Strings"), "\n")

re.Compile will not work. It will display an error message saying there is
nothing to compile. Now, if I do:

    Print "@" & Searchfor[n] & "@\n"

in both cases I will see precisely the same output. I can't distinguish one
from the other. So why does it compile in the first case, but not in the
second?

This is the mystery I must solve with a little help from a good soul out
there.

- fernando



--
Fernando Cabral
Blog: http://fernandocabral.org
Twitter: http://twitter.com/fjcabral
e-mail: [hidden email]
Facebook: [hidden email]
Telegram: +55 (37) 99988-8868
Wickr ID: fernandocabral
WhatsApp: +55 (37) 99988-8868
Skype:  fernandojosecabral
Landline: +55 (37) 3521-2183
Mobile: +55 (37) 99988-8868

As long as there is a single person in the world without a home or without food,
no politician or scientist can boast of anything.
_______________________________________________
Gambas-user mailing list
[hidden email]
https://lists.sourceforge.net/lists/listinfo/gambas-user

Re: Mystery with file.load and regexp

Fernando Cabral
I have found and worked around the problem. When you do the following:

    Dim Expressions As String[] = Split(File.Load("/home/fernando/.config/libreoffice/4/user/basic/indesejaveis.txt"), "\n")

the last item pushed into Expressions is an empty string ("") even though
it DOES NOT exist in the file. So the expressions are compiled one after
the other until the last one, which is empty. Then the program crashes.

So it seems there is a bug in File.Load(), a bug that I was able to
compensate for by adding the option True in the call to the function
Split().


Re: Mystery with file.load and regexp

Tony Morehen
File.Load is working fine.  It is just loading a file that has a
trailing "\n".  Split then adds an empty string as the last entry in the
array.  This is Split's documented behaviour.  Adding the True option
suppresses empty entries.  Note that Split will also add an empty entry
for any blank lines, i.e. "\n\n".  The True option will also suppress those
empty entries.
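
The behaviour described above can be sketched in a few lines of Gambas. This is only an illustration, assuming a Gambas 3 command-line project: sText is a made-up stand-in for what File.Load would return, the variable names are invented, and the True option is assumed to be Split's fourth argument (IgnoreVoid), with an empty string passed for the third (Escape).

```gambas
' Illustrative sketch only; sText stands in for the result of File.Load().
Public Sub Main()

  Dim sText As String = "Word\nPower\nThe same\n"
  Dim aAll As String[]
  Dim aClean As String[]

  ' Default call: whatever follows the final "\n" is an empty string,
  ' and Split keeps it as the last element of the array.
  aAll = Split(sText, "\n")
  Print "Without IgnoreVoid: "; aAll.Count; " elements"

  ' Assumed meaning of the fourth argument: IgnoreVoid = True drops
  ' empty elements, including the one after the trailing "\n".
  aClean = Split(sText, "\n", "", True)
  Print "With IgnoreVoid: "; aClean.Count; " elements"

End
```

Comparing the two counts makes the extra element visible, which printing each entry between "@" markers does not make obvious.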

A word of caution:  the True option will not suppress lines that
consist entirely of spaces.  I'd test to see whether that kind of line
also causes problems.
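
One possible way to guard against such lines, sketched under assumptions (a Gambas 3 command-line project, invented sample text and variable names), is to Trim each line before deciding whether to keep it. Whether this is appropriate depends on the patterns: Trim would also strip leading or trailing spaces that a regular expression was meant to match.

```gambas
' Illustrative sketch only; sText stands in for the result of File.Load().
Public Sub Main()

  Dim sText As String = "Word\n   \nPower\n"
  Dim aLines As String[]
  Dim aPatterns As New String[]
  Dim sLine As String
  Dim sClean As String

  aLines = Split(sText, "\n")

  For Each sLine In aLines
    ' Trim removes surrounding whitespace, so a line made only of
    ' spaces becomes an empty string and is skipped.
    sClean = Trim(sLine)
    If Len(sClean) > 0 Then
      aPatterns.Add(sClean)
    Endif
  Next

  Print "Patterns kept: "; aPatterns.Count

End
```

Only the entries that survive this filter would then be passed on to re.Compile.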



Re: Mystery with file.load and regexp

adamnt42@gmail.com
I think you might find that the last character of indesejaveis.txt is a
\n (as is the case for many, many files), so in actual fact your original
Split did exactly what it was supposed to do.

And your workaround is correct, although it is not actually a workaround:
it is a very common construct when dealing with text files.

So I very much doubt there is a bug in File.Load().

b


Re: Mystery with file.load and regexp

Fernando Cabral
bb wrote:

> I think you might find that the last character of indesejaveis.txt is a
> \n (as is the case for many, many files) so in actual fact your original
> split did exactly what it was supposed to do.
>
> And your work around is correct. Although it is not actually a work
> around it is a very common construct when dealing with text files.
>
> So, I very much doubt there is a bug in File.Load()

In the beginning, I thought File.Load() had an idiosyncratic behavior.
Then I understood that's not true. It just reads everything from the file
into the variable. So it is correct behavior when it brings in the last "\n".

Then I thought Split() had a strange behavior since it brought in something
that did not exist after the last separator, that is, an empty string.

Then I discovered I was wrong on that count too: if the last real element
is followed by a separator, then what comes after it is an empty string. So
I can't blame Split() for bringing it in, unless I explicitly tell Split()
to drop it.

Yes, in this case, the bug was in my brain. It is called lack of knowledge
or faulty reasoning.

Thank you bb and Tony for showing me the light.

- fernando

