New component gb.rss to generate and parse RSS documents

Tobias Boege-2
Hello all,

I wrote an RSS feed generator for one of my projects recently and luckily
also managed to complete the parser before the new semester starts tomorrow.
So you get a gb.rss component in the latest revision #8117.

It should support all the things that are mentioned in the RSS 2.0
specification here [1], but there are still some problems:

  * The date conversion routines ignore timezones completely, because
    I have no clue about working with timezones in Gambas.

  * The strings you give to Rss* objects are put into the XML file
    verbatim currently, which is not desirable if these strings
    are HTML or some other things that confuse an RSS (XML) parser.
    I guess those strings should be wrapped in a CDATA section.
    I'd be grateful for a routine that tells me when to use CDATA.

The attached project demonstrates the feed generator part. It takes the
output of "ps", for the lack of a better data source, and turns it into
an RSS feed, which is then offered through gb.httpd. I tested the resulting
feed with newsbeuter. Just load the feed, start some new programs and then
reload the feed.
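
As an illustration of the generator side, here is a minimal sketch in Python (not Gambas, and not the gb.rss API) of turning a process list into an RSS 2.0 document. The function name `build_feed`, the hard-coded process list and the localhost URL are assumptions made for the example; the attached project reads the real output of "ps" instead.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone
from email.utils import format_datetime


def build_feed(title, link, description, items):
    """Build a minimal RSS 2.0 document and return it as a string.

    `items` is a list of (title, description, pub_date) tuples, where
    pub_date is a timezone-aware datetime.
    """
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    # title, link and description are the required channel elements
    ET.SubElement(channel, "title").text = title
    ET.SubElement(channel, "link").text = link
    ET.SubElement(channel, "description").text = description
    for item_title, item_desc, pub_date in items:
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, "title").text = item_title
        ET.SubElement(item, "description").text = item_desc
        # pubDate uses the RFC 822 date format
        ET.SubElement(item, "pubDate").text = format_datetime(pub_date)
    return ET.tostring(rss, encoding="unicode")


# Hypothetical process list standing in for the output of "ps"
processes = [("1234", "bash"), ("1300", "newsbeuter")]
stamp = datetime(2017, 4, 2, 16, 7, 0, tzinfo=timezone.utc)
feed_xml = build_feed(
    "Running processes",
    "http://localhost:8080/",
    "One item per process",
    [(f"{name} ({pid})", f"PID {pid} runs {name}", stamp) for pid, name in processes],
)
print(feed_xml)
```

ElementTree escapes the text content on serialisation, so process names containing "&" or "<" would not break the document in this sketch.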

@Adrien or Fabien: While playing around with the XmlReader, I noticed some
problems. I tried to fix them to the best of my ability in #8116, but as
always it would be better if you looked through the matter again.

Regards,
Tobi

[1] http://cyber.harvard.edu/rss/rss.html

--
"There's an old saying: Don't change anything... ever!" -- Mr. Monk

------------------------------------------------------------------------------
Check out the vibrant tech community on one of the world's most
engaging tech sites, Slashdot.org! http://sdm.link/slashdot
_______________________________________________
Gambas-user mailing list
[hidden email]
https://lists.sourceforge.net/lists/listinfo/gambas-user

Attachment: rsstest-0.0.1.tar.gz (16K)

Re: New component gb.rss to generate and parse RSS documents

Benoît Minisini
On 02/04/2017 at 18:07, Tobias Boege wrote:

> Hello all,
>
> I wrote an RSS feed generator for one of my projects recently and could
> luckily complete also the parser before the new semester starts tomorrow.
> So you get a gb.rss component in the latest revision #8117.
>
> It should support all the things that are mentioned in the RSS 2.0
> specification here [1], but there are still some problems:
>
>   * The date conversion routines ignore timezones completely, because
>     I have no clue about working with timezones in Gambas.

As I don't know RSS, can you elaborate? What do you need to do exactly
with timezones?

Regards,

--
Benoît Minisini


Re: New component gb.rss to generate and parse RSS documents

Tobias Boege-2
On Sun, 02 Apr 2017, Benoît Minisini wrote:

> Le 02/04/2017 à 18:07, Tobias Boege a écrit :
> > Hello all,
> >
> > I wrote an RSS feed generator for one of my projects recently and could
> > luckily complete also the parser before the new semester starts tomorrow.
> > So you get a gb.rss component in the latest revision #8117.
> >
> > It should support all the things that are mentioned in the RSS 2.0
> > specification here [1], but there are still some problems:
> >
> >   * The date conversion routines ignore timezones completely, because
> >     I have no clue about working with timezones in Gambas.
>
> As I don't know RSS, can you elaborate? What do you need to do exactly
> with timezones?
>

The items in an RSS feed (and the feed itself) contain publication dates
such as

  Sat, 07 Sep 2002 10:00:00 GMT

At the moment, when I read this string into a Date and use it in a Gambas
application, the timezone is ignored, i.e. it will be the 7th Sep 2002 at
10:00:00 *system-local timezone*, which is not correct. You can see this
when you serialise the RSS object into an XML document again. It gives:

  Sat, 07 Sep 2002 10:00:00 +0100

because my system is in +0100 now.

I don't know how this situation is best handled. The Gambas Date type is
not big enough to carry timezone information, is it? Then I would have to
convert the given time to the system timezone

  10:00:00 GMT -> 11:00:00 +0100

which results in the XML output

  Sat, 07 Sep 2002 11:00:00 +0100

later, which is not identical to the source but at least represents the same
point in time. But I could imagine that being able to set the target timezone
explicitly would be desirable, e.g. when your RSS feed item represents a
story in a German newspaper, but your server runs in a US timezone.
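
The conversion described above can be sketched in Python with the standard library's RFC 822 helpers. This is only an illustration of the idea, not Gambas code; the function name `reexpress` is made up for the example. The publication date is parsed together with its zone and then re-serialised in an explicitly chosen target offset.

```python
from datetime import timedelta, timezone
from email.utils import format_datetime, parsedate_to_datetime


def reexpress(pub_date, offset_hours):
    """Parse an RFC 822 date and re-serialise it in another UTC offset."""
    instant = parsedate_to_datetime(pub_date)  # aware datetime, zone kept
    target = timezone(timedelta(hours=offset_hours))
    return format_datetime(instant.astimezone(target))


german = reexpress("Sat, 07 Sep 2002 10:00:00 GMT", 1)
us_east = reexpress("Sat, 07 Sep 2002 10:00:00 GMT", -5)
print(german)
print(us_east)
```

Both strings name the same point in time; only the offset chosen for display differs, which is what an explicit target timezone would control.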

Regards,
Tobi

--
"There's an old saying: Don't change anything... ever!" -- Mr. Monk


Re: New component gb.rss to generate and parse RSS documents

Benoît Minisini
On 02/04/2017 at 22:34, Tobias Boege wrote:

> On Sun, 02 Apr 2017, Benoît Minisini wrote:
>> Le 02/04/2017 à 18:07, Tobias Boege a écrit :
>>> Hello all,
>>>
>>> I wrote an RSS feed generator for one of my projects recently and could
>>> luckily complete also the parser before the new semester starts tomorrow.
>>> So you get a gb.rss component in the latest revision #8117.
>>>
>>> It should support all the things that are mentioned in the RSS 2.0
>>> specification here [1], but there are still some problems:
>>>
>>>   * The date conversion routines ignore timezones completely, because
>>>     I have no clue about working with timezones in Gambas.
>>
>> As I don't know RSS, can you elaborate? What do you need to do exactly
>> with timezones?
>>
>
> The items in an RSS feed (and the feed itself) contain publication dates
> such as
>
>   Sat, 07 Sep 2002 10:00:00 GMT
>
> At the moment, when I read this string into a Date and use it in a Gambas
> application, the timezone is ignored, i.e. it will be the 7th Sep 2002 at
> 10:00:00 *system-local timezone*, which is not correct. You can see this
> when you serialise the RSS object into an XML document again. It gives:
>
>   Sat, 07 Sep 2002 10:00:00 +0100
>
> because my system is in +0100 now.
>
> I don't know how this situation is best handled. The Gambas Date type is
> not big enough to carry timezone information, is it? Then I would have to
> convert the given time to the system timezone
>
>   10:00:00 GMT -> 11:00:00 +0100
>
> which results in the XML output
>
>   Sat, 07 Sep 2002 11:00:00 +0100
>
> later, which is not identical to the source but at least represents the same
> point in time. But I could imagine that being able to set the target timezone
> explicitly would be desirable, e.g. when your RSS feed item represents a
> story in a German newspaper, but your server runs in a US timezone.
>
> Regards,
> Tobi
>

Dates in Gambas are stored as a number of days and microseconds from a
specific origin, and are always considered as UTC.

They are converted to the timezone associated with the localisation when
you use Str() or Format() or Print.

To convert a date to a specific timezone, you have to convert the date
part taken as UTC, and then you add (or subtract I think, must be
checked!) the time zone value (which is in hours).

Maybe this is a utility function to implement in gb.util.web...
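
On the add-or-subtract question, a small Python sketch of the arithmetic, assuming the usual convention that "+0100" means one hour ahead of UTC. The helper names are hypothetical, and the sign convention Gambas itself uses for its timezone value is not covered here.

```python
from datetime import datetime, timedelta


def utc_to_local(utc_value, offset_hours):
    """Wall-clock time in a zone offset_hours ahead of UTC: add the offset."""
    return utc_value + timedelta(hours=offset_hours)


def local_to_utc(local_value, offset_hours):
    """Inverse direction: subtract the offset to get back to UTC."""
    return local_value - timedelta(hours=offset_hours)


stored = datetime(2002, 9, 7, 10, 0, 0)  # naive value, interpreted as UTC
local = utc_to_local(stored, 1)          # zone written as +0100
print(local.strftime("%H:%M:%S"))
```

So with this convention the offset is added when going from the stored UTC value to a displayed local time, and subtracted when parsing a local time back into UTC.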

--
Benoît Minisini


Re: New component gb.rss to generate and parse RSS documents

zxMarce
I do not see the need to "force" developers needing timezone handling functionality to add a reference to gb.util.web; expecting to have timezone capability in a web component is simply not intuitive.

Couldn't the timezone be a new property of the Date type/object, defaulting to the System's timezone? This way older programs and interpreters ignoring it would still work.

An alternative would be to add an (optional) timezone parameter to DateAdd, and/or a new Format specifier for timezone. I don't recall right now, but if Gambas has Dateserial/Timeserial functions, they too could receive the timezone as optional info.

Just my two cents.
zxMarce

On Apr 2, 2017, at 17:40, "Benoît Minisini" <[hidden email]> wrote:

>Le 02/04/2017 à 22:34, Tobias Boege a écrit :
>> On Sun, 02 Apr 2017, Benoît Minisini wrote:
>>> Le 02/04/2017 à 18:07, Tobias Boege a écrit :
>>>> Hello all,
>>>>
>>>> I wrote an RSS feed generator for one of my projects recently and
>could
>>>> luckily complete also the parser before the new semester starts
>tomorrow.
>>>> So you get a gb.rss component in the latest revision #8117.
>>>>
>>>> It should support all the things that are mentioned in the RSS 2.0
>>>> specification here [1], but there are still some problems:
>>>>
>>>>   * The date conversion routines ignore timezones completely,
>because
>>>>     I have no clue about working with timezones in Gambas.
>>>
>>> As I don't know RSS, can you elaborate? What do you need to do
>exactly
>>> with timezones?
>>>
>>
>> The items in an RSS feed (and the feed itself) contain publication
>dates
>> such as
>>
>>   Sat, 07 Sep 2002 10:00:00 GMT
>>
>> At the moment, when I read this string into a Date and use it in a
>Gambas
>> application, the timezone is ignored, i.e. it will be the 7th Sep
>2002 at
>> 10:00:00 *system-local timezone*, which is not correct. You can see
>this
>> when you serialise the RSS object into an XML document again. It
>gives:
>>
>>   Sat, 07 Sep 2002 10:00:00 +0100
>>
>> because my system is in +0100 now.
>>
>> I don't know how this situation is best handled. The Gambas Date type
>is
>> not big enough to carry timezone information, is it? Then I would
>have to
>> convert the given time to the system timezone
>>
>>   10:00:00 GMT -> 11:00:00 +0100
>>
>> which results in the XML output
>>
>>   Sat, 07 Sep 2002 11:00:00 +0100
>>
>> later, which is not identical to the source but at least represents
>the same
>> point in time. But I could imagine that being able to set the target
>timezone
>> explicitly would be desirable, e.g. when your RSS feed item
>represents a
>> story in a German newspaper, but your server runs in a US timezone.
>>
>> Regards,
>> Tobi
>>
>
>Dates in Gambas are stored as a number of days and microseconds from a
>specific origin, and are always considered as UTC.
>
>They are converted to the timezone associated with the localisation
>when
>you use Str() or Format() or Print.
>
>To convert a date to a specific timezone, you have to convert the date
>part taken as UTC, and then you add (or subtract I think, must be
>checked!) the time zone value (which is in hours).
>
>Maybe this is a utility function to implement in gb.util.web...
>
>--
>Benoît Minisini
>

Re: New component gb.rss to generate and parse RSS documents

Benoît Minisini
On 03/04/2017 at 00:48, [hidden email] wrote:

> I do not see the need to "force" developers needing timezone handling
> functionality to add a reference to gb.util.web; expecting to have
> timezone capability in a web component is simply not intuitive.
>
> Couldn't the timezone be a new property of the Date type/object,
> defaulting to the System's timezone? This way older programs and
> interpreters ignoring it would still work.
>
> An alternative would be to add an (optional) timezone parameter to
> DateAdd, and/or a new Format specifier for timezone. I do not have it
> present now, but if Gambas has Dateserial/Timeserial functions, they
> could too receive timezone as optional info.
>
> Just my two cents. zxMarce
>

A lot of confusion again...

Time is absolute in Gambas: there is no timezone associated with a
Gambas date/time value.

Timezone is a local concept. By the way you have System.Timezone to get
the local timezone of your system.

When I talk about gb.util.web, I think about a function that converts a
standard (RFC #xxxx) date/time string representation with a timezone to
a Gambas date/time value.

Of course, it can be in gb.util. It depends on the format. If it is a
specific web date format, it's logical to put it in gb.util.web instead.

Regards,

--
Benoît Minisini


Re: New component gb.rss to generate and parse RSS documents

zxMarce
Got it.
Thanks Benoît.

On Apr 2, 2017, at 20:51, "Benoît Minisini" <[hidden email]> wrote:

>Le 03/04/2017 à 00:48, [hidden email] a écrit :
>> I do not see the need to "force" developers needing timezone handling
>> functionality to add a reference to gb.util.web; expecting to have
>> timezone capability in a web component is simply not intuitive.
>>
>> Couldn't the timezone be a new property of the Date type/object,
>> defaulting to the System's timezone? This way older programs and
>> interpreters ignoring it would still work.
>>
>> An alternative would be to add an (optional) timezone parameter to
>> DateAdd, and/or a new Format specifier for timezone. I do not have it
>> present now, but if Gambas has Dateserial/Timeserial functions, they
>> could too receive timezone as optional info.
>>
>> Just my two cents. zxMarce
>>
>
>A lot of confusion again...
>
>Time is absolute in Gambas: there is no timezone associated with a
>Gambas date/time value.
>
>Timezone is a local concept. By the way you have System.Timezone to get
>
>the local timezone of your system.
>
>When I talk about gb.util.web, I think about a function that convert a
>standard (RFC #xxxx) date/time string representation with a timezone to
>
>a Gambas date/time value.
>
>Of course, it can be in gb.util. It depends on the format. If it is a
>specific web date format, it's logical to put it in gb.util.web
>instead.
>
>Regards,
>
>--
>Benoît Minisini
>

Re: New component gb.rss to generate and parse RSS documents

Adrien Prokopowicz-2
In reply to this post by Tobias Boege-2
On Sun, 02 Apr 2017 18:07:26 +0200, Tobias Boege <[hidden email]> wrote:

> Hello all,
>
> I wrote an RSS feed generator for one of my projects recently and could
> luckily complete also the parser before the new semester starts tomorrow.
> So you get a gb.rss component in the latest revision #8117.
>
> It should support all the things that are mentioned in the RSS 2.0
> specification here [1], but there are still some problems:
>
>   * The date conversion routines ignore timezones completely, because
>     I have no clue about working with timezones in Gambas.
>
>   * The strings you give to Rss* objects are put into the XML file
>     verbatim currently, which is not desirable if these strings
>     are HTML or some other things that confuse an RSS (XML) parser.
>     I guess those strings should be wrapped in a CDATA section.
>     I'd be grateful for a routine that tells me when to use CDATA.
>
> The attached project demonstrates the feed generator part. It takes the
> output of "ps", for the lack of a better data source, and turns it into
> an RSS feed, which is then offered through gb.httpd. I tested the resulting
> feed with newsbeuter. Just load the feed, start some new programs and then
> reload the feed.
>
> @Adrien or Fabien: While playing around with the XmlReader, I noticed some
> problems. I tried to fix them to the best of my ability in #8116, but as
> always it would be better if you looked through the matter again.
>
> Regards,
> Tobi
>
> [1] http://cyber.harvard.edu/rss/rss.html
>

Hi Tobias,

First, thanks for the fixes. :)
The only thing that bothers me a bit is throwing an error if no attribute
is found when using the _get method on XmlReader.Node.Attributes.
It is made so it behaves like a Gambas collection (as XML Attributes can
be seen as a simple key-value store), so I think it is better to just
return Null in that case.

Other than that, it all looks good to me, so thanks again. :)

For your escaping problem, the XmlWriter.Element() now escapes the value
parameter so it is always safe to pass any String (all the other methods
dealing with text content already do that, like XmlWriter.Text() or
XmlNode.TextContent).

Currently, you can check if a string needs escaping by actually
serializing it and comparing the result to the original string:

     If XmlNode.Serialize(Value) = Value Then ' [...]

It's not optimal if you're dealing with very large strings, but it works.
However, if some elements of the RSS document are likely to contain large
HTML or some other markup, I think it's better performance-wise to just
always wrap their contents in a CDATA node, as their contents are not
checked.
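
The two approaches can be sketched in Python (not the gb.xml API; `needs_escaping` and `wrap_cdata` are names invented for this example). The check compares a string with its escaped form, and the CDATA wrapper additionally splits any "]]>" inside the text, since that sequence would otherwise end the section early.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape


def needs_escaping(value):
    """True if escaping would change the string (it contains &, < or >)."""
    return escape(value) != value


def wrap_cdata(value):
    """Wrap text in a CDATA section, splitting any embedded ']]>' terminator."""
    return "<![CDATA[" + value.replace("]]>", "]]]]><![CDATA[>") + "]]>"


html = "<p>Tom & Jerry ]]> done</p>"
document = "<description>" + wrap_cdata(html) + "</description>"
# A conforming parser hands back the original string unchanged.
round_trip = ET.fromstring(document).text
print(needs_escaping("plain text"), needs_escaping(html), round_trip == html)
```

Note that even the "always CDATA" route has to scan the string once for "]]>", so it is cheaper than full escaping but not entirely free.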

As a side note, all these problems with the XmlReader & Writer are starting
to annoy me quite a bit (as those are mostly regressions). I think I should
bring back the unit testing library I wrote some time ago, and write some
tests for gb.xml locally at least.

Regards,

--
Adrien Prokopowicz


Re: New component gb.rss to generate and parse RSS documents

Tobias Boege-2
In reply to this post by Tobias Boege-2
On Sun, 02 Apr 2017, Tobias Boege wrote:
> Hello all,
>
> I wrote an RSS feed generator for one of my projects recently and could
> luckily complete also the parser before the new semester starts tomorrow.
> So you get a gb.rss component in the latest revision #8117.
>

@Benoit: I'm currently writing the documentation for this (inside the
source files at first). How did adding documentation to the wiki go again?
Do you have to poke a script on the server side which creates template
pages or do I just create the pages myself?

And before you do that: I intend to implement Atom besides RSS, too.
Do you think the name gb.rss is good enough to capture that? Another
possibility would be gb.webfeed, I guess. "Web syndication" seems to
be yet another word which encompasses RSS and Atom...

Regards,
Tobi

--
"There's an old saying: Don't change anything... ever!" -- Mr. Monk


Re: New component gb.rss to generate and parse RSS documents

Benoît Minisini
On 11/04/2017 at 12:53, Tobias Boege wrote:

> On Sun, 02 Apr 2017, Tobias Boege wrote:
>> Hello all,
>>
>> I wrote an RSS feed generator for one of my projects recently and could
>> luckily complete also the parser before the new semester starts tomorrow.
>> So you get a gb.rss component in the latest revision #8117.
>>
>
> @Benoit: I'm currently writing the documentation for this (inside the
> source files at first). How did adding documentation to the wiki go again?
> Do you have to poke a script on the server side which creates template
> pages or do I just create the pages myself?

Normally, once I have updated the component info files on the wiki
server, creating a new page automatically picks up the documentation lines
from the source code.

>
> And before you do that: I intend to implement Atom besides RSS, too.
> Do you think the name gb.rss is good enough to capture that? Another
> possibility would be gb.webfeed, I guess. "Web syndication" seems to
> be yet another word which encompasses RSS and Atom...

Mmm... 'gb.web.syndication' so ?

--
Benoît Minisini


Re: New component gb.rss to generate and parse RSS documents

Tobias Boege-2
On Tue, 11 Apr 2017, Benoît Minisini wrote:

> Le 11/04/2017 à 12:53, Tobias Boege a écrit :
> > On Sun, 02 Apr 2017, Tobias Boege wrote:
> >> Hello all,
> >>
> >> I wrote an RSS feed generator for one of my projects recently and could
> >> luckily complete also the parser before the new semester starts tomorrow.
> >> So you get a gb.rss component in the latest revision #8117.
> >>
> >
> > @Benoit: I'm currently writing the documentation for this (inside the
> > source files at first). How did adding documentation to the wiki go again?
> > Do you have to poke a script on the server side which creates template
> > pages or do I just create the pages myself?
>
> Normally, once I have updated the component info files on the wiki
> server, creating a new page automatically picks up the documentation lines
> from the source code.
>

Cool!

> >
> > And before you do that: I intend to implement Atom besides RSS, too.
> > Do you think the name gb.rss is good enough to capture that? Another
> > possibility would be gb.webfeed, I guess. "Web syndication" seems to
> > be yet another word which encompasses RSS and Atom...
>
> Mmm... 'gb.web.syndication' so ?
>

Reading a little, it seems that syndication is more of a specific practice
which makes use of syndication formats like RSS and Atom. To be honest I
haven't heard the term syndication in that context before. Wikipedia, e.g.,
refers to these formats as "web feeds", too. I'm not aware of any danger
of confusion with other web technology if I name it "gb.web.feed" instead.

Regards,
Tobi

--
"There's an old saying: Don't change anything... ever!" -- Mr. Monk


Re: New component gb.rss to generate and parse RSS documents

Benoît Minisini
On 11/04/2017 at 15:28, Tobias Boege wrote:

>>
>> Mmm... 'gb.web.syndication' so ?
>>
>
> Reading a little, it seems that syndication is more of a specific practice
> which makes use of syndication formats like RSS and Atom. To be honest I
> haven't heard the term syndication in that context before. Wikipedia, e.g.,
> refers to these formats as "web feeds", too. I'm not aware of any danger
> of confusion with other web technology if I name it "gb.web.feed" instead.
>
> Regards,
> Tobi
>

Let's go for "gb.web.feed" then.

--
Benoît Minisini


Re: New component gb.rss to generate and parse RSS documents

Benoît Minisini
In reply to this post by Tobias Boege-2
On 02/04/2017 at 18:07, Tobias Boege wrote:
>
>   * The date conversion routines ignore timezones completely, because
>     I have no clue about working with timezones in Gambas.
>

Hi, Tobias.

I have added in revision #8122 two functions to the gb.util component:

Date.ToRFC822(), to convert a Gambas date/time value to its RFC822
string representation, with the timezone.

Date.FromRFC822(), to do the contrary.

Tell me if you can use them, and if you need me to add the same
functions for RFC3339 date format used by Atom.
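
For comparison, the RFC 3339 form used by Atom can be sketched in Python as follows. The function names are hypothetical and only the UTC ("Z") spelling is handled on the parsing side; a complete implementation would also accept numeric offsets such as "+01:00" and fractional seconds.

```python
from datetime import datetime, timedelta, timezone

RFC3339_UTC = "%Y-%m-%dT%H:%M:%SZ"


def to_rfc3339(instant):
    """Format an aware datetime as an RFC 3339 timestamp in UTC."""
    return instant.astimezone(timezone.utc).strftime(RFC3339_UTC)


def from_rfc3339(text):
    """Parse the UTC ('Z') form only; numeric offsets are not handled."""
    return datetime.strptime(text, RFC3339_UTC).replace(tzinfo=timezone.utc)


local = datetime(2002, 9, 7, 11, 0, 0, tzinfo=timezone(timedelta(hours=1)))
stamp = to_rfc3339(local)
print(stamp)
```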

Regards,

--
Benoît Minisini


Re: New component gb.rss to generate and parse RSS documents

Tobias Boege-2
On Sun, 16 Apr 2017, Benoît Minisini wrote:

> Le 02/04/2017 à 18:07, Tobias Boege a écrit :
> >
> >   * The date conversion routines ignore timezones completely, because
> >     I have no clue about working with timezones in Gambas.
> >
>
> Hi, Tobias.
>
> I have added in revision #8122 two functions to the gb.util component:
>
> Date.ToRFC822(), to convert a Gambas date/time value to its RFC822
> string representation, with the timezone.
>
> Date.FromRFC822(), to do the contrary.
>
> Tell me if you can use them, and if you need me to add the same
> functions for RFC3339 date format used by Atom.
>

Thanks for these. Reading the code (not testing it yet), I noticed four
things:

  * The weekday and second parts in the format are optional in the RFC
    but mandatory in your parser.

  * The year is a 2-digit number in the RFC. The RSS spec says it prefers
    4 digits. My current parser in gb.web.feed supports both, but treats
    2-digit years XY as 19XY (which I think is the most sensible
    interpretation with respect to the RFC but sadly excludes publication
    dates for news items near Christ's birth).

  * There is no consistency check in the parser if, in case a weekday is
    given, it matches the weekday of the date, like

      Fri, 18 Apr 2017 12:00:00 GMT

    would be invalid by the RFC ("5.2 SEMANTICS"), because the 18 Apr 2017
    is a Tuesday.

  * At one point you use Format$(..., "hh:nn:ss") which I think may be
    dangerous, because Format$() (as per docs) replaces ":" by the locale-
    specific time separator. I don't know if there are locales where this
    is different from ":", but the RFC requires it to be ":" exactly.

My parser does these four things. If you want to add them, the gb.web.feed
code is sufficiently commented in the relevant places.
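
To make the four points concrete, here is a rough Python sketch of a parser and formatter with those properties. It is not the gb.web.feed or gb.util code; the names, the regular expression and the restriction to "GMT", "UT" and numeric zones are assumptions made to keep the example short.

```python
import re
from datetime import datetime, timedelta, timezone

DAYS = ["Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"]
MONTHS = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]

PATTERN = re.compile(
    r"^(?:(?P<wday>[A-Za-z]{3}),\s+)?"                    # optional weekday
    r"(?P<day>\d{1,2})\s+(?P<mon>[A-Za-z]{3})\s+"
    r"(?P<year>\d{4}|\d{2})\s+"                           # 4 or 2 digit year
    r"(?P<h>\d{2}):(?P<m>\d{2})(?::(?P<s>\d{2}))?\s+"     # optional seconds
    r"(?P<zone>GMT|UT|[+-]\d{4})$"
)


def parse_rfc822(text):
    """Parse an RFC 822 date into an aware datetime, or raise ValueError."""
    match = PATTERN.match(text.strip())
    if match is None:
        raise ValueError("not an RFC 822 date: %r" % text)
    year = int(match["year"])
    if len(match["year"]) == 2:
        year += 1900  # treat a 2-digit year XY as 19XY
    zone = match["zone"]
    if zone in ("GMT", "UT"):
        offset = timedelta(0)
    else:
        sign = -1 if zone[0] == "-" else 1
        offset = sign * timedelta(hours=int(zone[1:3]), minutes=int(zone[3:5]))
    value = datetime(year, MONTHS.index(match["mon"]) + 1, int(match["day"]),
                     int(match["h"]), int(match["m"]), int(match["s"] or 0),
                     tzinfo=timezone(offset))
    # Consistency check: a given weekday must match the date itself
    if match["wday"] and DAYS[value.weekday()] != match["wday"]:
        raise ValueError("weekday does not match the date: %r" % text)
    return value


def format_rfc822(value):
    """Format with literal ':' separators, independent of any locale."""
    minutes = int(value.utcoffset().total_seconds()) // 60
    sign = "-" if minutes < 0 else "+"
    minutes = abs(minutes)
    return "%s, %02d %s %04d %02d:%02d:%02d %s%02d%02d" % (
        DAYS[value.weekday()], value.day, MONTHS[value.month - 1], value.year,
        value.hour, value.minute, value.second,
        sign, minutes // 60, minutes % 60)


print(format_rfc822(parse_rfc822("18 Apr 2017 12:00 GMT")))
```

Day and month names come from fixed English tables and the separators are written literally, so the result does not depend on the locale of the machine.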

As for Atom, I don't have a definite plan for when I'll add these classes
(haven't even read the specs yet). I don't need the functions right now,
at least.

About the incorporation of timezones in gb.web.feed (the last thing before
I mark the component as "Unfinished but stable"), my plan is to replace the
Date variables in the Rss* classes by an RssDate compound, consisting of a
normalised Date and a Timezone string (or constant), with an "apply timezone"
method probably. If you have a better idea, please let me know.

Regards,
Tobi

--
"There's an old saying: Don't change anything... ever!" -- Mr. Monk


Re: New component gb.rss to generate and parse RSS documents

Benoît Minisini
On 18/04/2017 at 19:44, Tobias Boege wrote:

> On Sun, 16 Apr 2017, Benoît Minisini wrote:
>> Le 02/04/2017 à 18:07, Tobias Boege a écrit :
>>>
>>>   * The date conversion routines ignore timezones completely, because
>>>     I have no clue about working with timezones in Gambas.
>>>
>>
>> Hi, Tobias.
>>
>> I have added in revision #8122 two functions to the gb.util component:
>>
>> Date.ToRFC822(), to convert a Gambas date/time value to its RFC822
>> string representation, with the timezone.
>>
>> Date.FromRFC822(), to do the contrary.
>>
>> Tell me if you can use them, and if you need me to add the same
>> functions for RFC3339 date format used by Atom.
>>
>
> Thanks for these. Reading the code (not testing it yet), I noticed four
> things:
>
>   * The weekday and second parts in the format are optional in the RFC
>     but mandatory in your parser.
>
>   * The year is a 2-digit number in the RFC. The RSS spec says it prefers
>     4 digits. My current parser in gb.web.feed supports both, but treats
>     2-digit years XY as 19XY (which I think is the most sensible
>     interpretation with respect to the RFC but sadly excludes publication
>     dates for news items near Christ's birth).
>
>   * There is no consistency check in the parser if, in case a weekday is
>     given, it matches the weekday of the date, like
>
>       Fri, 18 Apr 2017 12:00:00 GMT
>
>     would be invalid by the RFC ("5.2 SEMANTICS"), because the 18 Apr 2017
>     is a Tuesday.
>
>   * At one point you use Format$(..., "hh:nn:ss") which I think may be
>     dangerous, because Format$() (as per docs) replaces ":" by the locale-
>     specific time separator. I don't know if there are locales where this
>     is different from ":", but the RFC requires it to be ":" exactly.
>
> My parser does these four things. If you want to add them, the gb.web.feed
> code is sufficiently commented in the relevant places.

OK, I will look at it.
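
For readers following along, three of the points quoted above (2-digit years, the weekday consistency check, and a literal ":" separator) could be sketched roughly as follows. This is only an illustration: ExpandYear, WeekDayMatches and FormatTime are hypothetical names, not the code in gb.util or gb.web.feed, and the sketch assumes WeekDay() counts from 0 for Sunday.

```gambas
' Hypothetical helpers for illustration; not the gb.util or gb.web.feed code.

' Expand a 2-digit year XY to 19XY; longer years are taken as they are.
Private Function ExpandYear(sYear As String) As Integer

  If Len(sYear) = 2 Then Return 1900 + CInt(sYear)
  Return CInt(sYear)

End

' Compare an optional weekday name with the weekday of the date.
' Assumes WeekDay() returns 0 for Sunday up to 6 for Saturday.
Private Function WeekDayMatches(sDay As String, dDate As Date) As Boolean

  Dim aNames As String[] = ["Sun", "Mon", "Tue", "Wed", "Thu", "Fri", "Sat"]

  If Not sDay Then Return True ' the weekday part is optional
  Return aNames[WeekDay(dDate)] = sDay

End

' Build "hh:mm:ss" with a literal ":" so the locale cannot change it.
Private Function FormatTime(iHour As Integer, iMinute As Integer, iSecond As Integer) As String

  Return Format(iHour, "00") & ":" & Format(iMinute, "00") & ":" & Format(iSecond, "00")

End

Public Sub Main()

  Print ExpandYear("17")
  ' "Fri" does not match, because 18 Apr 2017 is a Tuesday
  Print WeekDayMatches("Fri", Date(2017, 4, 18))
  Print FormatTime(14, 0, 0)

End
```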

>
> As for Atom, I don't have a definite plan for when I'll add these classes
> (haven't even read the specs yet). I don't need the functions right now,
> at least.
>
> About the incorporation of timezones in gb.web.feed (the last thing before
> I mark the component as "Unfinished but stable"), my plan is to replace the
> Date variables in the Rss* classes by an RssDate compound, consisting of a
> normalised Date and a Timezone string (or constant), with an "apply timezone"
> method probably. If you have a better idea, please let me know.

What for? Timezone is only needed when dealing with Date as strings.
Internally, all Date values should be stored UTC. In computing, time is
absolute.

In other words, every date value has as many string representations as
the number of possible timezones.
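
A small sketch of that idea, using plain integer arithmetic rather than the Date type: one absolute time of day (kept in UTC) is rendered with different zone offsets. RenderTime is a hypothetical helper written for this illustration, and date rollover is deliberately ignored.

```gambas
' Hypothetical helper: render a UTC time of day (minutes since midnight)
' as seen in a zone that is iOffset minutes east of UTC.
' Offsets are assumed to stay within one day; the date itself is ignored.
Private Function RenderTime(iUtcMinutes As Integer, iOffset As Integer) As String

  Dim iLocal As Integer = (iUtcMinutes + iOffset + 1440) Mod 1440
  Dim iAbs As Integer = Abs(iOffset)
  Dim sZone As String

  sZone = If(iOffset < 0, "-", "+") & Format(iAbs \ 60, "00") & Format(iAbs Mod 60, "00")

  Return Format(iLocal \ 60, "00") & ":" & Format(iLocal Mod 60, "00") & " " & sZone

End

Public Sub Main()

  Dim iUtc As Integer = 16 * 60 ' the stored value: 16:00 UTC

  ' Three strings, one and the same point in time
  Print RenderTime(iUtc, 0)
  Print RenderTime(iUtc, 120)
  Print RenderTime(iUtc, -300)

End
```

The stored value never changes; only the offset chosen at formatting time does.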

Regards,

--
Benoît Minisini


Re: New component gb.rss to generate and parse RSS documents

Tobias Boege-2
On Tue, 18 Apr 2017, Benoît Minisini wrote:

> > About the incorporation of timezones in gb.web.feed (the last thing before
> > I mark the component as "Unfinished but stable"), my plan is to replace the
> > Date variables in the Rss* classes by an RssDate compound, consisting of a
> > normalised Date and a Timezone string (or constant), with an "apply timezone"
> > method probably. If you have a better idea, please let me know.
>
> What for? Timezone is only needed when dealing with Date as strings.
> Internally, all Date values should be stored UTC. In computing, time is
> absolute.
>
> In other words, every date value has as many string representations as
> the number of possible timezones.
>

I agree, but in the conversion Gambas object -> XML string, I want to give
the user a way to specify the timezone that is printed in the string
representation, as you do in ToRFC822().

Since this timezone may be different for every RssItem (or even different
between PubDate and LastBuildDate for the same RssItem), I have to store
it somewhere along the normalised (UTC) Date and not have it passed as an
argument to Rss.ToString().

When reading XML string -> Gambas object this timezone property would be
rather useless, yes.
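
The compound described above might look roughly like the class below. This is a guess at the shape only, not the actual gb.web.feed class: the member names Value and TimeZone and the "GMT" default are assumptions, and Date.ToRFC822() is called with the (date, timezone) arguments used elsewhere in this thread, which requires gb.util.

```gambas
' RssDate.class -- hypothetical sketch, not the actual gb.web.feed class
Export

Public Value As Date       ' the normalised (UTC) point in time
Public TimeZone As String  ' the zone to print on output, e.g. "GMT"

Public Sub _new(Optional dValue As Date, Optional sTimeZone As String = "GMT")

  Value = dValue
  TimeZone = sTimeZone

End

' Serialise with the zone stored alongside this particular date, so that
' PubDate and LastBuildDate can each carry their own zone.
' Assumes gb.util is loaded and provides Date.ToRFC822(Date, TimeZone).
Public Function ToString() As String

  Return Date.ToRFC822(Value, TimeZone)

End
```

Keeping the zone next to the date means the serialiser needs no extra timezone argument.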

Regards,
Tobi

--
"There's an old saying: Don't change anything... ever!" -- Mr. Monk


Re: New component gb.rss to generate and parse RSS documents

Benoît Minisini
In reply to this post by Tobias Boege-2
I have taken a look at your code.

Just a general remark: the names of exported function arguments should
help the user guess what they mean.

So, for example:

Static Public Function FormatDate(Dat As Date) As String

should be:

Static Public Function FormatDate({Date} As Date) As String

Better use full words instead of abbreviations! "{Date}" instead of
"Dat", "{String}" (or "Value") instead of "Str", and so on.

Regards,

--
Benoît Minisini


Re: New component gb.rss to generate and parse RSS documents

Benoît Minisini
In reply to this post by Benoît Minisini
On 18/04/2017 at 19:52, Benoît Minisini wrote:

> On 18/04/2017 at 19:44, Tobias Boege wrote:
>> On Sun, 16 Apr 2017, Benoît Minisini wrote:
>>> On 02/04/2017 at 18:07, Tobias Boege wrote:
>>>>
>>>>   * The date conversion routines ignore timezones completely, because
>>>>     I have no clue about working with timezones in Gambas.
>>>>
>>>
>>> Hi, Tobias.
>>>
>>> I have added in revision #8122 two functions to the gb.util component:
>>>
>>> Date.ToRFC822(), to convert a Gambas date/time value to its RFC822
>>> string representation, with the timezone.
>>>
>>> Date.FromRFC822(), to do the reverse.
>>>
>>> Tell me if you can use them, and if you need me to add the same
>>> functions for RFC3339 date format used by Atom.
>>>
>>
>> Thanks for these. Reading the code (not testing it yet), I noticed four
>> things:
>>
>>   * The weekday and second parts in the format are optional in the RFC
>>     but mandatory in your parser.
>>
>>   * The year is a 2-digit number in the RFC. The RSS spec says it prefers
>>     4 digits. My current parser in gb.web.feed supports both, but treats
>>     2-digit years XY as 19XY (which I think is the most sensible
>>     interpretation with respect to the RFC but sadly excludes publication
>>     dates for news items near Christ's birth).
>>
>>   * There is no consistency check in the parser if, in case a weekday is
>>     given, it matches the weekday of the date, like
>>
>>       Fri, 18 Apr 2017 12:00:00 GMT
>>
>>     would be invalid by the RFC ("5.2 SEMANTICS"), because the 18 Apr 2017
>>     is a Tuesday.
>>
>>   * At one point you use Format$(..., "hh:nn:ss") which I think may be
>>     dangerous, because Format$() (as per docs) replaces ":" by the
>>     locale-specific time separator. I don't know if there are locales
>>     where this is different from ":", but the RFC requires it to be ":"
>>     exactly.
>>
>> My parser does these four things. If you want to add them, the gb.web.feed
>> code is sufficiently commented in the relevant places.
>
> OK, I will look at it.
>

I have updated the gb.util component with your fixes. I think you can
now use it directly in gb.web.feed.

Regards,

--
Benoît Minisini


Re: New component gb.rss to generate and parse RSS documents

Tobias Boege-2
On Tue, 18 Apr 2017, Benoît Minisini wrote:

> On 18/04/2017 at 19:52, Benoît Minisini wrote:
> > On 18/04/2017 at 19:44, Tobias Boege wrote:
> >> On Sun, 16 Apr 2017, Benoît Minisini wrote:
> >>> On 02/04/2017 at 18:07, Tobias Boege wrote:
> >>>>
> >>>>   * The date conversion routines ignore timezones completely, because
> >>>>     I have no clue about working with timezones in Gambas.
> >>>>
> >>>
> >>> Hi, Tobias.
> >>>
> >>> I have added in revision #8122 two functions to the gb.util component:
> >>>
> >>> Date.ToRFC822(), to convert a Gambas date/time value to its RFC822
> >>> string representation, with the timezone.
> >>>
> >>> Date.FromRFC822(), to do the reverse.
> >>>
> >>> Tell me if you can use them, and if you need me to add the same
> >>> functions for RFC3339 date format used by Atom.
> >>>
> >>
> >> Thanks for these. Reading the code (not testing it yet), I noticed four
> >> things:
> >>
> >>   * The weekday and second parts in the format are optional in the RFC
> >>     but mandatory in your parser.
> >>
> >>   * The year is a 2-digit number in the RFC. The RSS spec says it prefers
> >>     4 digits. My current parser in gb.web.feed supports both, but treats
> >>     2-digit years XY as 19XY (which I think is the most sensible
> >>     interpretation with respect to the RFC but sadly excludes publication
> >>     dates for news items near Christ's birth).
> >>
> >>   * There is no consistency check in the parser if, in case a weekday is
> >>     given, it matches the weekday of the date, like
> >>
> >>       Fri, 18 Apr 2017 12:00:00 GMT
> >>
> >>     would be invalid by the RFC ("5.2 SEMANTICS"), because the 18 Apr 2017
> >>     is a Tuesday.
> >>
> >>   * At one point you use Format$(..., "hh:nn:ss") which I think may be
> >>     dangerous, because Format$() (as per docs) replaces ":" by the
> >>     locale-specific time separator. I don't know if there are locales
> >>     where this is different from ":", but the RFC requires it to be ":"
> >>     exactly.
> >>
> >> My parser does these four things. If you want to add them, the gb.web.feed
> >> code is sufficiently commented in the relevant places.
> >
> > OK, I will look at it.
> >
>
> I have updated the gb.util components with your fixes. Now I think you
> can use it directly in gb.web.feed.
>

I think there is something wrong with at least one of these functions.
In FromRFC822() you do towards the end:

  (1)  dDate -= Frac(Date(Now))
  (2)  dDate += GetRFC822Zone(aDate[6])

What is (1) supposed to do? Isn't Frac(Date(Now)) the current time?

Regarding (2), I think I have (after an hour) wrapped my head around when
to add and when to subtract the timezone offset in a conversion:
FromRFC822() should *subtract* the timezone offset from the date, since

  18:00:00 +0200

is the same point in time as

  16:00:00 +0000

(isn't it?)

And in ToRFC822() you do something with System.TimeZone. Isn't a Date value
assumed to be relative to UTC, or +0000? So why bring the system's timezone
into the equation when a target timezone is given by the TimeZone argument?
Or is the Date input supposed to be in the local timezone? Then ToRFC822()
and FromRFC822() wouldn't be inverse to each other...

For example, I'm in +0200 currently (I think) and converting 6pm as a time
in GMT to GMT again results in 2pm:

  Print Date.ToRFC822(Date.FromRFC822("27 Apr 2017 18:00:00 GMT"), "GMT")
  Thu, 27 Apr 2017 14:0:0 GMT

You can also see that ToRFC822() misses Format(..., "00") calls in the time
components.

And could you add an Optional ByRef TimeZone As String argument to
FromRFC822() and set it to aDate[6] just before the function returns?
The rationale is: in gb.web.feed I still want to have an RssDate class
with a Date and a TimeZone member. When reading a date, I want to initialise
the TimeZone member with the timezone given in the RFC822 string, so that
reading an RSS document into an Rss object and then serialising that Rss
object immediately afterwards doesn't change all the timezone strings in
the document.
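
To make the direction of the conversion concrete, here is a minimal sketch with plain integer arithmetic: parsing subtracts the zone offset to get UTC, and the output is zero-padded. ZoneToMinutes and ToUtcMinutes are hypothetical helpers, not the gb.util functions; only numeric zones like "+0200" are handled and date rollover is ignored.

```gambas
' Hypothetical helper: turn a numeric zone such as "+0200" or "-0530"
' into an offset in minutes east of UTC. Named zones are not handled.
Private Function ZoneToMinutes(sZone As String) As Integer

  Dim iMinutes As Integer

  iMinutes = CInt(Mid$(sZone, 2, 2)) * 60 + CInt(Mid$(sZone, 4, 2))

  If Left$(sZone) = "-" Then Return -iMinutes
  Return iMinutes

End

' Hypothetical helper: convert a local time of day (minutes since midnight)
' to UTC by *subtracting* the zone offset. The date itself is ignored.
Private Function ToUtcMinutes(iLocalMinutes As Integer, sZone As String) As Integer

  Return (iLocalMinutes - ZoneToMinutes(sZone) + 1440) Mod 1440

End

Public Sub Main()

  ' 18:00 +0200 should come out as the same instant as 16:00 +0000
  Dim iUtc As Integer = ToUtcMinutes(18 * 60, "+0200")

  ' Zero-padded components joined by a literal ":"
  Print Format(iUtc \ 60, "00") & ":" & Format(iUtc Mod 60, "00") & ":00 +0000"

End
```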

Regards,
Tobi

--
"There's an old saying: Don't change anything... ever!" -- Mr. Monk
