Hi,
Currently, most of the R packages included in Fedora use the following lines in the SPEC:
URL: https://cran.r-project.org/web/packages/%{packname}/
Source0: https://cran.r-project.org/src/contrib/%{packname}_%{version}.tar.gz
This URL format is not recommended by CRAN, but more importantly, the Source0 format stops working when a new version is released, as [1] noted. However, there is an immutable format available, as [2] pointed out. So my proposal is to always use the following lines instead:
URL: https://cran.r-project.org/package=%{packname}
Source0: %{url}&version=%{version}
which are both shorter and immutable, and I propose to add this to the R packaging guidelines too.
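To make the difference concrete, here is a minimal Python sketch of how the two Source0 styles expand for one package. The function names are made up for illustration, and simmer 4.0.1 is only a sample package; nothing here is part of an existing tool.

```python
CRAN = "https://cran.r-project.org"


def legacy_source(packname: str, version: str) -> str:
    """Old style: a direct path into src/contrib."""
    return f"{CRAN}/src/contrib/{packname}_{version}.tar.gz"


def immutable_url(packname: str) -> str:
    """Proposed URL tag: the canonical package page."""
    return f"{CRAN}/package={packname}"


def immutable_source(packname: str, version: str) -> str:
    """Proposed Source0 tag, i.e. %{url}&version=%{version} once expanded."""
    return f"{immutable_url(packname)}&version={version}"


print(legacy_source("simmer", "4.0.1"))
print(immutable_url("simmer"))
print(immutable_source("simmer", "4.0.1"))
```

The legacy form encodes a location that changes once the release is superseded, while the proposed form only encodes the package name and version.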
If we agree on this, is there any easy way to request a system-wide change like that to all existing packages?
Regards,
"IU" == Iñaki Ucar iucar@fedoraproject.org writes:
IU> This URL format is not recommended by CRAN, but more importantly, the Source0 format does not work anymore, as [1] noted, when a new version is released. However, there is an immutable format available, as [2] pointed out. So my proposal is to always use the following lines instead:
IU> URL: https://cran.r-project.org/package=%{packname}
IU> Source0: %{url}&version=%{version}
That's good to know. We should consider an %r_source macro similar to the existing %pypi_source macro which is used for Python packages. See: https://docs.fedoraproject.org/en-US/packaging-guidelines/Python/#_source_fi...
I would be happy to write it, but I don't know enough about R packaging to know how uniformly the tags you mention above (%packname, specifically) are adhered to in the current packaging set.
IU> which are both shorter and immutable, and I propose to add this to the R packaging guidelines too.
Well you should certainly open a ticket with the packaging committee if you wish to propose changes to the packaging guidelines. https://pagure.io/packaging-committee/
You can even send a pull request.
IU> If we agree on this, is there any easy way to request a system-wide change like that to all existing packages?
https://fedoraproject.org/wiki/Mass_package_changes
- J<
On Wed, 24 Oct 2018 at 15:02, Jason L Tibbitts III tibbs@math.uh.edu wrote:
> "IU" == Iñaki Ucar iucar@fedoraproject.org writes:
> IU> This URL format is not recommended by CRAN, but more importantly, the Source0 format does not work anymore, as [1] noted, when a new version is released. However, there is an immutable format available, as [2] pointed out. So my proposal is to always use the following lines instead:
> IU> URL: https://cran.r-project.org/package=%{packname}
> IU> Source0: %{url}&version=%{version}
> That's good to know. We should consider an %r_source macro similar to the existing %pypi_source macro which is used for Python packages. See: https://docs.fedoraproject.org/en-US/packaging-guidelines/Python/#_source_fi...
That would certainly be great. I would call it %cran_source instead, and a %cran_url would be helpful too, so that we can write the following:
URL: %cran_url
Source0: %cran_source
or maybe something like %cran_origin could directly resolve into the above two lines.
> I would be happy to write it, but I don't know enough about R packaging to know how uniformly the tags you mention above (%packname, specifically) are adhered to in the current packaging set.
%packname is a global variable defined by the R2spec util. I've checked (script attached in case you're interested) and basically all the packages use this global. So the new macro(s) would accept up to two arguments: the name of the R package, which should default to %packname if defined, and the version, by default %version.
> IU> which are both shorter and immutable, and I propose to add this to the R packaging guidelines too.
> Well you should certainly open a ticket with the packaging committee if you wish to propose changes to the packaging guidelines. https://pagure.io/packaging-committee/
> You can even send a pull request.
Great. I'll wait for that if you're going to push forward this new macro.
> IU> If we agree on this, is there any easy way to request a system-wide change like that to all existing packages?
Thanks for the pointer.
> - J<
-- Iñaki Ucar
Kind reminder. Any update on this?
Iñaki
On Tuesday, 30 October 2018 at 16:36, Iñaki Ucar wrote:
> Kind reminder. Any update on this?
Jason asked you to open a ticket in the Packaging Committee tracker and open a pull request against the guidelines. Have you done that?
Regards, Dominik
Hi Dominik,
On Wed, 31 Oct 2018 at 00:43, Dominik 'Rathann' Mierzejewski dominik@greysector.net wrote:
> On Tuesday, 30 October 2018 at 16:36, Iñaki Ucar wrote:
> > Kind reminder. Any update on this?
> Jason asked you to open a ticket in the Packaging Committee tracker and open a pull request against the guidelines. Have you done that?
I thought that the existence of the new macro was a precondition to open an issue or a PR against the guidelines recommending it. That's why I wrote the following:
On Wed, 24 Oct 2018 at 17:01, Iñaki Ucar iucar@fedoraproject.org wrote:
> Great. I'll wait for that if you're going to push forward this new macro.
But if you confirm that this is not the case, I certainly misunderstood Jason's email, so I'll do it right away.
I've been trying to stay away from computers for a few days. Just catching back up with email.
- J<
Spent a couple of seconds looking at this and have a couple of questions. Once I have answers it should take me only a couple of minutes to give you something to test.
Does the source URL for CRAN not have any kind of file extension?
Could you provide a couple of sample packages for me to look at and use in testing? I don't know much of anything about R.
- J<
Looking at this further, this URL scheme is just terrible and will be "fun" to make use of.
Basically you have to keep in mind that a tool like spectool can't trust the filename that is sent by the remote web server and will instead use only the filename extracted from the URL.
That means if you use something like this:
Source0: https://cran.r-project.org/package=%{packname}&version=%{version}...
you'll get a filename like
package=webp&version=0.4
(for a random package, R-webp, that I grabbed).
And that's not a useful filename; rpm won't unpack it.
What you have to do is the somewhat painful:
Source0: https://cran.r-project.org/package=%{packname}&version=%{version}...
Now, since we're going to hide this behind a macro, it's not the worst thing in the world. But it leaves questions:
* Is this guaranteed to continue to work in future? I don't think that the remote host gets the URL fragment identifier at all so I think it should be OK, but I haven't really tested that.
* Since we need to know the extension, can we expect tar.gz or are there packages with other archive formats?
So, I think we can deal but I don't think CRAN considered this point at all and that's unfortunate.
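The filename issue can be illustrated with a short Python sketch. It assumes the local filename is simply whatever follows the last slash of the Source URL, and that the workaround appends a fragment such as #/webp_0.4.tar.gz; both are simplifications for illustration, not spectool's actual code.

```python
import posixpath
from urllib.parse import urldefrag


def source_filename(url: str) -> str:
    """Return the part of the URL after the last '/' (assumed naming rule)."""
    return posixpath.basename(url)


plain = "https://cran.r-project.org/package=webp&version=0.4"
with_fragment = plain + "#/webp_0.4.tar.gz"

print(source_filename(plain))           # not a usable archive name
print(source_filename(with_fragment))   # looks like a normal tarball name

# The fragment is a client-side construct: stripping it gives back the
# URL that is actually requested from the server.
print(urldefrag(with_fragment).url == plain)
```

Under those assumptions the fragment changes only the name of the local file, not the request itself.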
- J<
On Wed, 31 Oct 2018 at 18:25, Jason L Tibbitts III tibbs@math.uh.edu wrote:
> Looking at this further, this URL scheme is just terrible and will be "fun" to make use of.
> Basically you have to keep in mind that a tool like spectool can't trust the filename that is sent by the remote web server and will instead use only the filename extracted from the URL.
> That means if you use something like this:
> Source0: https://cran.r-project.org/package=%{packname}&version=%{version}...
> you'll get a filename like
> package=webp&version=0.4
> (for a random package, R-webp, that I grabbed).
Correct.
> And that's not a useful filename; rpm won't unpack it.
But that URL, for instance:
https://cran.r-project.org/package=simmer&version=3.0.0
returns a redirection (303) to the complete URL, with file extension.
> What you have to do is the somewhat painful:
> Source0: https://cran.r-project.org/package=%{packname}&version=%{version}...
> Now, since we're going to hide this behind a macro, it's not the worst thing in the world. But it leaves questions:
> - Is this guaranteed to continue to work in future? I don't think that the remote host gets the URL fragment identifier at all so I think it should be OK, but I haven't really tested that.
CRAN maintainers are pretty strict with this kind of stuff: if it works now, it's guaranteed to continue to work.
> - Since we need to know the extension, can we expect tar.gz or are there packages with other archive formats?
There are no other formats: every package is tar.gz. But, as I pointed out above, the immutable URL is a redirection to the complete URL, so you can still extract the extension.
> So, I think we can deal but I don't think CRAN considered this point at all and that's unfortunate.
> - J<
I missed the example in my last email. Here it is:
$ curl -I "https://cran.r-project.org/package=simmer&version=3.0.0"
HTTP/1.1 303 See Other
Date: Wed, 31 Oct 2018 17:39:46 GMT
Server: Apache/2.4.10 (Debian)
Location: https://cran.r-project.org/src/contrib/Archive/simmer/simmer_3.0.0.tar.gz
Content-Type: text/html; charset=iso-8859-1
That was an old version. The newest one:
$ curl -I "https://cran.r-project.org/package=simmer&version=4.0.1"
HTTP/1.1 303 See Other
Date: Wed, 31 Oct 2018 17:40:42 GMT
Server: Apache/2.4.10 (Debian)
Location: https://cran.r-project.org/src/contrib/simmer_4.0.1.tar.gz
Content-Type: text/html; charset=iso-8859-1
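For illustration, the Location header of the second response above can be picked apart with a few lines of Python. This works on the captured text only and makes no network request; the function name is illustrative.

```python
import posixpath
from urllib.parse import urlsplit

# Headers copied from the curl output for simmer 4.0.1.
RESPONSE = """\
HTTP/1.1 303 See Other
Date: Wed, 31 Oct 2018 17:40:42 GMT
Server: Apache/2.4.10 (Debian)
Location: https://cran.r-project.org/src/contrib/simmer_4.0.1.tar.gz
Content-Type: text/html; charset=iso-8859-1
"""


def redirect_target(raw_headers: str):
    """Return the value of the Location header, or None if it is absent."""
    for line in raw_headers.splitlines()[1:]:  # skip the status line
        name, sep, value = line.partition(":")
        if sep and name.strip().lower() == "location":
            return value.strip()
    return None


target = redirect_target(RESPONSE)
filename = posixpath.basename(urlsplit(target).path)
print(target)
print(filename)
```

This only shows that a client following the redirect can see the full filename; it says nothing about what a tool sees if it derives the name from the original URL alone.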
-- Iñaki Ucar
"IU" == Iñaki Ucar iucar@fedoraproject.org writes:
IU> https://cran.r-project.org/package=simmer&version=3.0.0
IU> returns a redirection (303) to the complete URL, with file extension.
303 is actually "See Other". Which is odd as that's usually sent in response to a PUT or POST, not a GET. Maybe you can get the files via POST as well; I'm not sure.
In any case, none of that has any effect on the filename that spectool (really curl) will use. It can't use data supplied by the remote host for that, for obvious reasons.
IU> CRAN maintainers are pretty strict with this kind of stuff: if it works now, it's guaranteed to continue to work.
Well, that's good, but this is a hack so it's not a terrible idea to inform them that we have to use this kind of thing so that they can either bless this method or provide a cleaner one.
IU> There are no other formats: every package is tar.gz. But, as I pointed out above, the immutable URL is a redirection to the complete URL, so you can still extract the extension.
No, you can't. Not in the context and under the limitations where we're running. Certainly we can't know anything about that in an RPM macro as we have to provide an extension in a complete vacuum.
But given what you say, certainly defaulting to tar.gz will work for everything now.
Try dropping the below into /usr/lib/rpm/macros.d/macros.test (temporarily; make sure to delete it when you're done experimenting). See if it gives the results you expect when you use %cran_url and %cran_source. Do fedpkg prep or some local builds. Try spectool -g and rpmspec -P.
Note that I've snuck a bit of magic in there which I'm not sure should be kept: If you don't have %packname defined and you call either %cran_url or %cran_source, then it will be automatically defined for you by stripping the leading "R-" from the name.
With this, you can just have:
Name: R-webp
Version: 0.4
Release: 3%{?dist}
Summary: A New Format for Lossless and Lossy Image Compression
License: MIT
URL: %cran_url
Source0: %cran_source
And use %packname in %prep as usual without explicitly defining it. But I'm not sure that much magic is a good idea.
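The fallback described above can be sketched in Python as follows. This is only an illustration of the logic, not the macro itself: the `macros` dict stands in for rpm's macro table, the function names mirror the macro names for readability, and the templates are copied from the macro file in this message.

```python
import re

PROJECT_TEMPLATE = "https://cran.r-project.org/package=PACKNAME"
PACKAGE_TEMPLATE = (
    PROJECT_TEMPLATE + "&version=PACKVERSION#/PACKNAME_PACKVERSION.EXTENSION"
)


def resolve_packname(macros: dict, arg=None) -> str:
    """Explicit argument first, then packname, then name with 'R-' stripped."""
    if arg:
        return arg
    if "packname" not in macros:
        # The "magic": define packname as a side effect for later use.
        macros["packname"] = re.sub(r"^R-", "", macros["name"])
    return macros["packname"]


def cran_url(macros: dict, packname=None) -> str:
    return PROJECT_TEMPLATE.replace("PACKNAME", resolve_packname(macros, packname))


def cran_source(macros: dict, packname=None, version=None, extension="tar.gz") -> str:
    url = PACKAGE_TEMPLATE.replace("PACKNAME", resolve_packname(macros, packname))
    url = url.replace("PACKVERSION", version or macros["version"])
    return url.replace("EXTENSION", extension)


spec = {"name": "R-webp", "version": "0.4"}
print(cran_url(spec))
print(cran_source(spec))
print(spec["packname"])
```

A spec that already defines packname is left alone, which is what keeps the fallback harmless for packages whose names do not follow the "R-" convention exactly.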
(And while we're doing R macros, that package suggests that we should also have a macro defined to %_libdir/R/library....)
- J<
# Macros to replace overly complicated references to CRAN URLs and source files.
# %cran_source -
#   Expands to the CRAN source URL for a package
#   Accepts zero to three arguments:
#   1:  The CRAN project name, defaulting to %packname if it is defined.
#       If not, R- will be stripped from %name and %packname defined to that.
#   2:  The CRAN version, defaulting to %version.
#   3:  The file extension, defaulting to %__cran_default_extension (tar.gz).
#   Requires %__cran_package_url_template and %__cran_default_extension to be defined.
#   %__cran_package_url_template will undergo substitution (case-sensitive):
#   * "PACKNAME" will be replaced with the above CRAN project name.
#   * "PACKVERSION" will be replaced with the above CRAN version.
#   * "EXTENSION" will be replaced with the above extension.
#
# %cran_url -
#   Expands to the CRAN URL for a package
#   Accepts zero or one arguments:
#   1:  The CRAN project name, defaulting to %packname if it is defined.
#       If not, R- will be stripped from %name and %packname defined to that.
#   Requires %__cran_project_url_template to be defined.
#   %__cran_project_url_template will undergo substitution (case-sensitive):
#   * "PACKNAME" will be replaced with the above CRAN project name.

%__cran_project_url_template https://cran.r-project.org/package=PACKNAME
%__cran_package_url_template %{__cran_project_url_template}&version=PACKVERSION#/PACKNAME_PACKVERSION.EXTENSION
%__cran_default_extension tar.gz

%cran_source() %{lua:
    local src = rpm.expand('%1')
    local ver = rpm.expand('%2')
    local ext = rpm.expand('%3')
    local url = rpm.expand('%__cran_package_url_template')
\
    -- If no first argument, try %packname, then %name with 'R-' stripped.
    -- Note that rpm leaves macros unchanged if they are not defined.
    if src == '%1' then
        src = rpm.expand('%packname')
    end
    if src == '%packname' then
        src = string.gsub(rpm.expand('%name'), "^R%-", "")
        -- Since packname wasn't defined, define it for convenience.
        rpm.define("packname " .. src)
    end
\
    -- If no second argument, use %version
    if ver == '%2' then
        ver = rpm.expand('%version')
    end
\
    -- If no third argument, use the preset default extension
    if ext == '%3' then
        ext = rpm.expand('%__cran_default_extension')
    end
\
    -- Now substitute in all the values
    url = string.gsub(url, "PACKNAME", src)
    url = string.gsub(url, "PACKVERSION", ver)
    url = string.gsub(url, "EXTENSION", ext)
\
    print(url)
}

%cran_url() %{lua:
    local src = rpm.expand('%1')
    local url = rpm.expand('%__cran_project_url_template')
\
    -- If no first argument, try %packname, then %name with 'R-' stripped.
    -- Note that rpm leaves macros unchanged if they are not defined.
    if src == '%1' then
        src = rpm.expand('%packname')
    end
    if src == '%packname' then
        src = string.gsub(rpm.expand('%name'), "^R%-", "")
        -- Since packname wasn't defined, define it for convenience.
        rpm.define("packname " .. src)
    end
\
    -- Substitute in the URL value
    url = string.gsub(url, "PACKNAME", src)
\
    print(url)
}
On Wed, 31 Oct 2018 at 19:52, Jason L Tibbitts III tibbs@math.uh.edu wrote:
> "IU" == Iñaki Ucar iucar@fedoraproject.org writes:
> IU> https://cran.r-project.org/package=simmer&version=3.0.0
> IU> returns a redirection (303) to the complete URL, with file extension.
> 303 is actually "See Other". Which is odd as that's usually sent in response to a PUT or POST, not a GET. Maybe you can get the files via POST as well; I'm not sure.
Why is this odd? It's not "moved", either permanently or temporarily (301, 302). It clearly matches the "see other" case. 303 was *primarily* motivated by the POST use case, but I think this is a pretty fair use.
> In any case, none of that has any effect on the filename that spectool (really curl) will use. It can't use data supplied by the remote host for that, for obvious reasons.
> IU> CRAN maintainers are pretty strict with this kind of stuff: if it works now, it's guaranteed to continue to work.
> Well, that's good, but this is a hack so it's not a terrible idea to inform them that we have to use this kind of thing so that they can either bless this method or provide a cleaner one.
I'll raise the issue on the R-devel mailing list and report back.
> IU> There are no other formats: every package is tar.gz. But, as I pointed out above, the immutable URL is a redirection to the complete URL, so you can still extract the extension.
> No, you can't. Not in the context and under the limitations where we're running. Certainly we can't know anything about that in an RPM macro as we have to provide an extension in a complete vacuum.
> But given what you say, certainly defaulting to tar.gz will work for everything now.
For what it's worth, this (the extension) is clearly specified in the "Writing R Extensions" and the "R Installation and Administration" manuals. It always has been the same, and I'd say it's *extremely* unlikely to change.
> Try dropping the below into /usr/lib/rpm/macros.d/macros.test (temporarily; make sure to delete it when you're done experimenting). See if it gives the results you expect when you use %cran_url and %cran_source. Do fedpkg prep or some local builds. Try spectool -g and rpmspec -P.
Thanks, I'll give it a try and report back.
> Note that I've snuck a bit of magic in there which I'm not sure should be kept: If you don't have %packname defined and you call either %cran_url or %cran_source, then it will be automatically defined for you by stripping the leading "R-" from the name.
This is great. However, in theory, given the naming guidelines, by stripping the leading "R-" you should get the package name. In practice, at least one package doesn't adhere to this: R-TH-data, while the R package name is TH.data, not TH-data. I see that the SPEC says "# Cannot use . in name", but this is clearly not true (maybe it was true long ago?).
> With this, you can just have:
> Name: R-webp
> Version: 0.4
> Release: 3%{?dist}
> Summary: A New Format for Lossless and Lossy Image Compression
> License: MIT
> URL: %cran_url
> Source0: %cran_source
> And use %packname in %prep as usual without explicitly defining it. But I'm not sure that much magic is a good idea.
> (And while we're doing R macros, that package suggests that we should also have a macro defined to %_libdir/R/library....)
That would require a good ton of magic. You have seen something like this:
%global rlibdir %{_libdir}/R/library
The thing is, this is the path for R packages *with* compiled code, while R packages *without* compiled code must go to %_datadir/R/library. That's why every R package has this global on top of the SPEC. Are you able to detect that and set the path appropriately with an RPM macro? :) That would certainly be very convenient for us packagers.
> - J<
On Wed, 31 Oct 2018 at 21:22, Iñaki Ucar iucar@fedoraproject.org wrote:
> while the R package name is TH.data, not TH-data. I see that the SPEC says "# Cannot use . in name", but this is clearly not true (maybe it was true long ago?).
Well... the guidelines for Python state: 'Note that when a module that has a dot in its name, the usual rule about changing "." to "-" applies.' But the guidelines for R say nothing about that, and as a result, most packages with a dot don't change it. Maybe R-TH-data is the only one that is compliant after all? :)
On Wednesday, 31 October 2018 at 21:31 +0100, Iñaki Ucar wrote:
> On Wed, 31 Oct 2018 at 21:22, Iñaki Ucar iucar@fedoraproject.org wrote:
> > while the R package name is TH.data, not TH-data. I see that the SPEC says "# Cannot use . in name", but this is clearly not true (maybe it was true long ago?).
> Well... the guidelines for Python state: 'Note that when a module that has a dot in its name, the usual rule about changing "." to "-" applies.' But the guidelines for R say nothing about that, and as a result, most packages with a dot don't change it. Maybe R-TH-data is the only one that is compliant after all? :)
And the guidelines also say to lowercase by default…
So my packagename computing macros just do this nowadays (aside from adding the correct prefix or suffix)
r = string.gsub(r, "[%p%s]+", "-")
r = string.gsub(r, "^-", "")
r = string.gsub(r, "-$", "")
r = string.lower(r)
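For readers more at home in Python, a rough equivalent of those four Lua lines is sketched below. It assumes Lua's %p and %s classes cover ASCII punctuation and whitespace (as in the C locale); the function name is illustrative.

```python
import re
import string

# One or more ASCII punctuation or whitespace characters, like Lua's [%p%s]+.
_SEPARATORS = re.compile(
    "[" + re.escape(string.punctuation + string.whitespace) + "]+"
)


def normalize(name: str) -> str:
    """Collapse separator runs to '-', trim edge dashes, then lowercase."""
    r = _SEPARATORS.sub("-", name)
    r = re.sub(r"^-", "", r)
    r = re.sub(r"-$", "", r)
    return r.lower()


for upstream in ("TH.data", "data.table", " Some_Name. "):
    print(upstream, "->", normalize(upstream))
```

Note that the mapping is one-way: several upstream spellings can normalize to the same result, so the upstream name cannot be recovered from the normalized one and has to be kept separately.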
"IU" == Iñaki Ucar iucar@fedoraproject.org writes:
IU> This is great. However, in theory, given the naming guidelines, by stripping the leading "R-" you should get the package name. In practice, at least one package doesn't adhere to this: R-TH-data, while the R package name is TH.data, not TH-data. I see that the SPEC says "# Cannot use . in name", but this is clearly not true (maybe it was true long ago?).
Why is that a problem? You would just define %packname in that case and nothing changes.
Look for the 50% case. Does it simplify at least half of the packages while not making things harder for the rest? I don't know the answer but I would be surprised if it wasn't 'yes' even if you change 50% to 90%.
IU> That would require a good ton of magic. You have seen something like this:
IU> %global rlibdir %{_libdir}/R/library
IU> The thing is, this is the path for R packages *with* compiled code, while R packages *without* compiled code must go to %_datadir/R/library. That's why every R package has this global on top of the SPEC.
Well, what you'd generally do is simply use a different macro for the noarch location and the arch-specific location. So you'd have one macro defined for things under libdir and another for things under datadir, like Perl and Python and such do.
If you really wanted to get down into it, it's tough to magically define a macro depending on BuildArch (though I could be missing a trick), but you could conceivably have macros like %r_archful_package and %r_noarch_package.
They could define some macro like your %rlibdir and also take care of adding the build dependencies and (where needed) the BuildArch: line and the R-core dependency. There's really a whole lot you could do, down to having macros used in %install, adding that annoying empty %build section, even generating a file list so you don't have to manually list so much in %files.
It depends on how far you want to go, and how specific you can be before you're not actually simplifying a majority of the R packages we have.
- J<
On Wed, 31 Oct 2018 at 21:48, Jason L Tibbitts III tibbs@math.uh.edu wrote:
> "IU" == Iñaki Ucar iucar@fedoraproject.org writes:
> IU> This is great. However, in theory, given the naming guidelines, by stripping the leading "R-" you should get the package name. In practice, at least one package doesn't adhere to this: R-TH-data, while the R package name is TH.data, not TH-data. I see that the SPEC says "# Cannot use . in name", but this is clearly not true (maybe it was true long ago?).
> Why is that a problem? You would just define %packname in that case and nothing changes.
> Look for the 50% case. Does it simplify at least half of the packages while not making things harder for the rest? I don't know the answer but I would be surprised if it wasn't 'yes' even if you change 50% to 90%.
Don't get me wrong: I'm totally in with this change. I was just putting all the information on the table. And I'd be surprised if it wasn't "yes" for less than 95%. :)
> IU> That would require a good ton of magic. You have seen something like this:
> IU> %global rlibdir %{_libdir}/R/library
> IU> The thing is, this is the path for R packages *with* compiled code, while R packages *without* compiled code must go to %_datadir/R/library. That's why every R package has this global on top of the SPEC.
> Well, what you'd generally do is simply use a different macro for the noarch location and the arch-specific location. So you'd have one macro defined for things under libdir and another for things under datadir, like Perl and Python and such do.
> If you really wanted to get down into it, it's tough to magically define a macro depending on BuildArch (though I could be missing a trick), but you could conceivably have macros like %r_archful_package and %r_noarch_package.
> They could define some macro like your %rlibdir and also take care of adding the build dependencies and (where needed) the BuildArch: line and the R-core dependency. There's really a whole lot you could do, down to having macros used in %install, adding that annoying empty %build
That is quite annoying, yes. :)
> section, even generating a file list so you don't have to manually list so much in %files.
> It depends on how far you want to go, and how specific you can be before you're not actually simplifying a majority of the R packages we have.
> - J<
On Wed, Oct 31, 2018, 5:01 PM Iñaki Ucar iucar@fedoraproject.org wrote:
> On Wed, 31 Oct 2018 at 21:48, Jason L Tibbitts III tibbs@math.uh.edu wrote:
> > "IU" == Iñaki Ucar iucar@fedoraproject.org writes:
> > IU> This is great. However, in theory, given the naming guidelines, by stripping the leading "R-" you should get the package name. In practice, at least one package doesn't adhere to this: R-TH-data, while the R package name is TH.data, not TH-data. I see that the SPEC says "# Cannot use . in name", but this is clearly not true (maybe it was true long ago?).
> > Why is that a problem? You would just define %packname in that case and nothing changes.
> > Look for the 50% case. Does it simplify at least half of the packages while not making things harder for the rest? I don't know the answer but I would be surprised if it wasn't 'yes' even if you change 50% to 90%.
> Don't get me wrong: I'm totally in with this change. I was just putting all the information on the table. And I'd be surprised if it wasn't "yes" for less than 95%. :)
All my packages were created using R2spec, so they would all use %{packname}. I would guess that most packages used the generator as well.
IU> That would require a good ton of magic. You have seen something like this:
IU> %global rlibdir %{_libdir}/R/library
IU> The thing is, this is the path for R packages *with* compiled code, while R packages *without* compiled code must go to %_datadir/R/library. That's why every R package has this global on top of the SPEC.
This can be determined from the NeedsCompilation key in the DESCRIPTION file, which is what (my fork of) R2spec does.
Well, what you'd generally do is simply use a different macro for the noarch location and the arch-specific location. So you'd have one macro defined for things under libdir, another macro for things under datadir. Like Perl and Python and such do.
If you really wanted to get down into it, it's tough to magically define a macro depending on BuildArch (though I could be missing a trick) but you could conceivably have macros like %r_archful_package and %r_noarch_package.
They could define some macro like your %rlibdir and also take care of adding the build dependencies and (where needed) the BuildArch: line and the R-core dependency. There's really a whole lot you could do, down to having macros used in %install, adding that annoying empty %build
That is quite annoying, yes. :)
section, even generating a file list so you don't have to manually list so much in %files.
Basically anything under %{rlibdir}%{packname} is needed and nothing appears elsewhere. If we didn't need to mark doc and license files, we could have just specified that top-level directory. The hard part about that is the naming is pretty inconsistent. Because CRAN disallows providing the full license text in LICENSE, upstreams that want to follow the terms of the license will put it in a separate file and the name is never the same.
It depends on how far you want to go, and how specific you can be before you're not actually simplifying a majority of the R packages we have.
In the same way Go has moved to mostly generated spec files, I think R can mostly get away with this as well. We can get most of the information from the DESCRIPTION file or rewrite R2spec to output what we need. I don't really know how macros work or I might have attempted this earlier.
- J<
-- Iñaki Ucar
On Thu, 1 Nov 2018 at 05:05, Elliott Sales de Andrade quantum.analyst@gmail.com wrote:
Basically anything under %{rlibdir}%{packname} is needed and nothing appears elsewhere. If we didn't need to mark doc and license files, we could have just specified that top-level directory. The hard part about that is the naming is pretty inconsistent. Because CRAN disallows providing the full license text in LICENSE, upstreams that want to follow the terms of the license will put it in a separate file and the name is never the same.
This is not correct. The naming is very consistent and it's clearly specified in the "Writing R Extensions" manual (section 1.1 Package structure). Most licenses (such as GPL) must not be bundled in the sources. Some of them (such as MIT) require a file with the year and the name of the copyright holders, and this file must be called LICENSE or LICENCE.
If there are other copyright holders different from the authors, and in general if the authoring information is too complex, files AUTHORS and COPYRIGHTS may be present.
Anyway, the presence of these files must be specified in the DESCRIPTION file (LICENSE/LICENCE -> License field; AUTHORS -> Author field; COPYRIGHTS -> Copyright field), see, e.g., the XML and bookdown packages.
"JLT" == Jason L Tibbitts tibbs@math.uh.edu writes:
JLT> It depends on how far you want to go, and how specific you can be before you're not actually simplifying a majority of the R packages we have.
Just for fun I took the R-webp package and constructed a macro that generates most of the package. Obviously it's somewhat bound to this specific package, but it appears that most R packages have rather similar boilerplate so it might be more generally useful. It's just a hack but it seems like it might be interesting to explore this a bit. The technique should still apply to package sets which have significant uniformity.
The spec looks like:
-----
Name: R-webp
Version: 0.4
Release: 3%{?dist}
Summary: A New Format for Lossless and Lossy Image Compression

License: MIT
URL: %cran_url
Source0: %cran_source

Suggests: R-jpeg R-png

BuildRequires: libwebp-devel
BuildRequires: R-jpeg R-png

%description
Lossless webp images are 26% smaller in size compared to PNG. Lossy webp images are 25-34% smaller in size compared to JPEG. This package reads and writes webp images into a 3 (rgb) or 4 (rgba) channel bitmap array using conventions from the 'jpeg' and 'png' packages.

%r_simple_archful_package

%files -f %packname.files

%changelog
-----
This builds fine and produces the same output as the current spec in git. I could go even further and not have the %files section at all, really.
I don't really know how useful this would be. I know (from another package I glanced at) that applying patches seems to be a bit weird because of the directory structure, so maybe using %autosetup isn't the right way to go. But I could still easily duplicate the patch application part of %autosetup. And the auto file-list generation stuff could easily grow smarts to handle other cases.
The macros, which you can drop temporarily in /usr/lib/rpm/macros.d for testing, follow. These aren't "complete" and you still need the other macros I posted earlier. You can cram them together in one file if you like.
- J<
# The basic sections
%r_prep %{lua:\
local packname = rpm.expand('%packname')
local autosetup = rpm.expand("%{autosetup -c -n " .. packname .. "}")
\
print(autosetup .. "\\n")
}

%r_install %{lua:\
local rlibdir = rpm.expand('%rlibdir')
local bindir = rpm.expand('%_bindir')
local packname = rpm.expand('%packname')
local buildroot = rpm.expand('%buildroot')
\
-- A function to simplify adding to the file list
local function add_file(file, type)
    print("echo '")
    if (type ~= nil) then
        print("%" .. type .. " ")
    end
    print(rlibdir .. "/" .. packname)
    if (file ~= nil) then
        print("/" .. file)
    end
    print("' >> " .. packname .. ".files\\n")
end
\
print("mkdir -p " .. buildroot .. rlibdir .. "\\n")
print(bindir .. "/R CMD INSTALL -l " .. buildroot .. rlibdir .. " " .. packname .. "\\n")
print("test -d " .. packname .. "/src && (cd " .. packname .. "/src; rm -f *.o *.so)\\n")
print("rm -f " .. buildroot .. rlibdir .. "/R.css\\n")
\
-- R packages have a somewhat regularized file structure
-- This could be pushed out to a shell script and called like %find_lang is
-- called. But for now we'll just inline the code.
print("\\n# Generate a default file list\\n")
add_file(nil, "dir")
add_file("html/", "doc")
add_file("libs", "dir")
add_file("libs/" .. packname .. ".so")
add_file("DESCRIPTION")
add_file("LICENSE", "license")
add_file("NEWS", "doc")
add_file("INDEX")
add_file("NAMESPACE")
add_file("Meta/")
add_file("R/")
add_file("help/")
\
print("ls -lR " .. buildroot .. "\\n")
}

%r_check %{lua:\
local bindir = rpm.expand('%_bindir')
local packname = rpm.expand('%packname')
\
print(bindir .. "/R CMD check " .. packname .. "\\n")
}

# For noarch packages
%r_noarch_package %{lua:\
print("BuildArch: noarch\\n")
print("Requires: R-core\\n")
print("BuildRequires: R-devel tex(latex)\\n")
rpm.define("rlibdir %{_datadir}/R/library")
print("%build\\n\\n")
}

%r_archful_package %{lua:\
print("BuildRequires: R-devel tex(latex)\\n")
rpm.define("rlibdir %{_libdir}/R/library")
print("%build\\n\\n")
}

%r_simple_archful_package %{lua:\
print(rpm.expand("%r_archful_package"))
\
local prep = rpm.expand("%r_prep")
print("%prep\\n")
print(prep .. "\\n\\n")
\
-- Hack around bizarre redefinition of %install.
local pre_install = rpm.expand('%{?_enable_debug_packages:%{debug_package}}')
print(pre_install .. "\\n")
\
local install = rpm.expand('%r_install')
print("%install\\n")
print(install .. "\\n\\n")
\
local check = rpm.expand('%r_check')
print("%check\\n")
print(check .. "\\n\\n")
\
-- We could even add the files section here, but this needs magic to
-- determine packname since it hasn't yet been defined.
-- print("%files -f " .. packname .. ".files\n")
}
And with some tweaks to the macro file (current version at https://www.math.uh.edu/~tibbs/fedora/macros.test), the R-uuid package is reduced to the following. Not quite as simple, but it shows how you can still split out the individual sections when you need to add something, and also cope with a difficult version number. (I switched the argument order around for %cran_source.)
Sadly you still need %build if not using %r_simple_archful_package. I could explain why but....
Name: R-uuid
Version: 0.1.2
Release: 6%{?dist}
Summary: Tools for generating and handling of UUIDs

License: MIT
URL: %cran_url
Source0: %cran_source 0.1-2

BuildRequires: libuuid-devel

%description
Tools for generating and handling of UUIDs (Universally Unique Identifiers).

%r_archful_package

%prep
%r_prep

pushd %{packname}
rm configure.ac configure src/Makevars.in src/[a-z]*.[ch]
sed -i -e '/configure/d' -e '/Makevars/d' -e '/src\/[a-z].*\.[ch]/d' MD5
rm -r src/config.h.in src/win32
sed -i -e '/config.h/d' MD5
cat > src/Makevars << EOF
PKG_CFLAGS = \$(shell pkg-config --cflags uuid)
PKG_LIBS = \$(shell pkg-config --libs uuid)
EOF
popd

%build

%install
%r_install

%check
%r_check

%files -f %packname.files

%changelog
All this would be very useful. CRAN guidelines are pretty strict, and they are enforced by R packaging and checking tools, and additionally upon submission to CRAN, with many automatic checks. As a result, and as you say, most SPEC sections in most packages can be fully automated, and I think we should tend to that, with proper mechanisms to treat special cases.
Let me add to your findings from my experience as both R and Fedora package maintainer:
- Version: As you already found out, there are many version formats allowed. The only special cases are those using a hyphen, as in R-uuid, where the hyphen needs to be replaced with a dot. Other formats may contain two or three components, year and month... but they don't require special treatment if there are no hyphens.
- Arch, Requires, BuildRequires, libdir/datadir: As Elliott pointed out already, archful packages can be detected just by looking into the DESCRIPTION file, by checking whether the NeedsCompilation flag exists. If it does, the package must go to libdir and R-devel must be added as BuildRequires. Furthermore, the LinkingTo field in the DESCRIPTION file lists packages that must be added as BuildRequires, such as R-Rcpp-devel. If NeedsCompilation does not exist, the package must go to datadir, noarch needs to be specified and R-core must be added as Requires. Further Requires usually are a little bit trickier to automate and should be managed by us.
- Package devel: This could be automated also. LinkingTo other R packages is possible if headers are listed under inst/include (in the sources; the installation path is %{rlibdir}/%{packname}/include) in those packages. So the R-%packname-devel package must be defined if such directory exists. There are two flavors for this: packages that NeedsCompilation, so they are archful, and packages that don't. The latter are devel-only (see, e.g., R-BH).
- %prep, %build, %install, %check, %files: These can be automated too. %build (void) and %install are always the same. %prep and %check may require some tuning (patching, unbundling; defining some exports and setting some flags in the "R CMD check" command). %files are pretty standard: there is a basic set, and then others that may or may not be there, such as LICENSE/LICENCE, NEWS/NEWS.md, README/README.md, Changelog, and CITATION; and directories such as libs and include.
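The DESCRIPTION checks in the list above could be sketched in shell roughly as follows, for use outside rpmbuild (for instance in a spec generator run on an unpacked tarball). This is only an illustration: the function names are hypothetical, it assumes the NeedsCompilation field carries a yes/no value, and it assumes every LinkingTo entry maps to an R-<name>-devel package, following the R-Rcpp-devel example.

```shell
# Sketch only: helper names are hypothetical, not part of R2spec or rpm.

# desc_field FILE FIELD
# Print the value of FIELD from a DESCRIPTION file, joining continuation lines.
desc_field() {
    awk -v field="$2" '
        # A new field starts at column 0; remember whether it is the one we want.
        /^[^ \t]/ {
            show = 0
            if (index($0, field ":") == 1) {
                show = 1
                sub(/^[^:]*:[ \t]*/, "")
                printf "%s", $0
                next
            }
        }
        # Continuation lines belong to the most recent field.
        show && /^[ \t]/ {
            sub(/^[ \t]+/, " ")
            printf "%s", $0
        }
        END { printf "\n" }
    ' "$1"
}

# r_package_kind FILE
# Print "archful" when NeedsCompilation is "yes" (assumed value), else "noarch".
r_package_kind() {
    if [ "$(desc_field "$1" NeedsCompilation)" = "yes" ]; then
        echo archful
    else
        echo noarch
    fi
}

# r_linkingto_buildrequires FILE
# Print one R-<name>-devel line per LinkingTo entry (naming is an assumption).
r_linkingto_buildrequires() {
    desc_field "$1" LinkingTo | tr ',' '\n' |
        sed -e 's/([^)]*)//g' -e 's/^[[:space:]]*//' -e 's/[[:space:]]*$//' \
            -e '/^$/d' -e 's/.*/R-&-devel/'
}
```

A generator could call r_package_kind to choose between libdir and datadir, and r_linkingto_buildrequires to seed the BuildRequires list, leaving the remaining Requires to the packager.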
I'm not familiar at all with RPM macros, but I can help with all these details.
On Thu, 1 Nov 2018 at 03:22, Jason L Tibbitts III tibbs@math.uh.edu wrote:
On Thursday, 1 November 2018 at 13:36 +0100, Iñaki Ucar wrote:
Hi,
- Arch, Requires, BuildRequires, libdir/datadir: As Elliott pointed out already, archful packages can be detected just by looking into the DESCRIPTION file, by checking whether the NeedsCompilation flag exists. If it does, the package must go to libdir and R-devel must be added as BuildRequires. Furthermore, the LinkingTo field in the DESCRIPTION file lists packages that must be added as BuildRequires,
And that makes it yet another language that needs https://github.com/rpm-software-management/rpm/issues/104 and the functional part of https://pagure.io/fesco/issue/2004 to be automated properly.
%files are pretty standard: there is a basic set, and then others that may or may not be there, such as LICENSE/LICENCE, NEWS/NEWS.md, README/README.md, Changelog, and CITATION; and directories such as libs and include.
Take a look at https://copr.fedorainfracloud.org/coprs/nim/fontpackages-2/builds/ for an example of how to pass doc wildcards to generic automation macros.
(currently polishing it and cross-pollinating with go macros)
Regards,
"IU" == Iñaki Ucar iucar@fedoraproject.org writes:
IU> - Version: As you already found out, there are many version formats allowed.
But it's easy to work from Version: in the normal case and simply allow the CRAN version to be specified separately, as I showed in the R-uuid sample. If it's rather more common that updating the version in two places is a burden, we can certainly look at other ways to do it.
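As a minimal sketch of the two-version arrangement, the mapping between the CRAN version and the RPM Version: tag, together with the versioned download URL proposed at the start of the thread, could look like this in shell. The function names are hypothetical and are not what %cran_source does internally.

```shell
# Sketch only: helper names are hypothetical.

# cran_to_rpm_version CRAN_VERSION
# Replace every hyphen with a dot so the result is usable in Version:.
cran_to_rpm_version() {
    printf '%s\n' "$1" | sed 's/-/./g'
}

# cran_source_url PACKNAME CRAN_VERSION
# Build the versioned URL; note that it takes the CRAN version, not the RPM one.
cran_source_url() {
    printf 'https://cran.r-project.org/package=%s&version=%s\n' "$1" "$2"
}
```

For R-uuid, `cran_to_rpm_version 0.1-2` prints 0.1.2, while the source URL keeps the hyphenated 0.1-2.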
IU> - Arch, Requires, BuildRequires, libdir/datadir: As Elliott pointed out already, archful packages can be detected just by looking into the DESCRIPTION file, by checking whether the NeedsCompilation flag exists.
As nim pointed out in a previous message, it's "not possible" to make use of this information for many things as long as DESCRIPTION is within the tarball, because unpacking does not happen until well after specfile parsing (and thus macro expansion) has happened. (Technically all things are possible but there isn't too much you can do with a source file before it's even been downloaded, and we certainly have to be able to parse the spec before that point.)
There are of course some things you can do. The %prep and %install scriptlets, for example, are just shell code which can be inserted by macros but which actually runs much later. So you can use that information to figure out how to build, and what files to install. But you can't look in there to add BuildArch: noarch or add build dependencies because those have to be done much earlier.
So that's why I just provided separate macros for the archful and noarch cases.
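Because the %install scriptlet runs after the package has been installed into the buildroot, a file list can be generated there by inspecting what is actually present. The following is a rough sketch of that idea, not the code in the macro file: the function name is hypothetical, and the choice of which optional files are tagged %doc or %license is an assumption.

```shell
# Sketch only: r_gen_filelist is a hypothetical helper.

# r_gen_filelist BUILDROOT RLIBDIR PACKNAME
# Print %files entries for one installed R package, listing optional
# files and directories only when they exist under BUILDROOT.
r_gen_filelist() {
    fl_root=$1
    fl_pkgdir=$2/$3

    echo "%dir $fl_pkgdir"
    echo "%doc $fl_pkgdir/html"
    # Entries treated here as the basic set every package installs.
    for f in DESCRIPTION INDEX NAMESPACE Meta R help; do
        echo "$fl_pkgdir/$f"
    done
    # Optional license files.
    for f in LICENSE LICENCE; do
        if [ -e "$fl_root$fl_pkgdir/$f" ]; then
            echo "%license $fl_pkgdir/$f"
        fi
    done
    # Optional documentation files.
    for f in NEWS NEWS.md README README.md; do
        if [ -e "$fl_root$fl_pkgdir/$f" ]; then
            echo "%doc $fl_pkgdir/$f"
        fi
    done
    # Optional plain files and directories.
    for f in CITATION libs include; do
        if [ -e "$fl_root$fl_pkgdir/$f" ]; then
            echo "$fl_pkgdir/$f"
        fi
    done
}
```

In a scriptlet this might be called as `r_gen_filelist %{buildroot} %{rlibdir} %{packname} > %{packname}.files`, with the result consumed by `%files -f`.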
IU> - %prep, %build, %install, %check, %files: These can be automated too.
Well, look at the two examples I provided. The %files list isn't completely consistent (one package has COPYING and the other doesn't) but the test macro file I provided conditionalizes this properly.
IU> I'm not familiar at all with RPM macros, but I can help with all these details.
So then perhaps have a look at those examples I provided and see how they would actually work in real life on more than the two specs I converted. If I get time today I'll fill things in for the noarch package case.
- J<
On Thu, 1 Nov 2018 at 15:45, Jason L Tibbitts III tibbs@math.uh.edu wrote:
"IU" == Iñaki Ucar iucar@fedoraproject.org writes:
IU> - Version: As you already found out, there are many version formats allowed.
But it's easy to work from Version: in the normal case and simply allow the CRAN version to be specified separately, as I showed in the R-uuid sample. If it's rather more common that updating the version in two places is a burden, we can certainly look at other ways to do it.
Your example is fine. I was, again, just completing the picture to show that your solution is enough to treat every case.
IU> - Arch, Requires, BuildRequires, libdir/datadir: As Elliott pointed out already, archful packages can be detected just by looking into the DESCRIPTION file, by checking whether the NeedsCompilation flag exists.
As nim pointed out in a previous message, it's "not possible" to make use of this information for many things as long as DESCRIPTION is within the tarball, because unpacking does not happen until well after specfile parsing (and thus macro expansion) has happened. (Technically all things are possible but there isn't too much you can do with a source file before it's even been downloaded, and we certainly have to be able to parse the spec before that point.)
There are of course some things you can do. The %prep and %install scriptlets, for example, are just shell code which can be inserted by macros but which actually runs much later. So you can use that information to figure out how to build, and what files to install. But you can't look in there to add BuildArch: noarch or add build dependencies because those have to be done much earlier.
So that's why I just provided separate macros for the archful and noarch cases.
I see. Anyway, I suppose that it's healthy to preserve some manual intervention in these sections.
IU> - %prep, %build, %install, %check, %files: These can be automated too.
Well, look at the two examples I provided. The %files list isn't completely consistent (one package has COPYING and the other doesn't) but the test macro file I provided conditionalizes this properly.
From "Writing R Extensions":
"Whereas you should feel free to include a license file in your source distribution, please do not arrange to install yet another copy of the GNU COPYING or COPYING.LIB files but refer to the copies on https://www.R-project.org/Licenses/ and included in the R distribution (in directory share/licenses)".
Therefore, that file shouldn't be there and should be removed (actually, it's funny because uuid's maintainer is a member of R-Core, and we usually get our submissions to CRAN rejected due to smaller details ¯_(ツ)_/¯).
IU> I'm not familiar at all with RPM macros, but I can help with all these details.
So then perhaps have a look at those examples I provided and see how they would actually work in real life on more than the two specs I converted. If I get time today I'll fill things in for the noarch package case.
You can count on that. I'll try to identify corner cases to see how your macros behave in such packages and report back as time permits.
- J<
"IU" == Iñaki Ucar iucar@fedoraproject.org writes:
IU> I see. Anyway, I suppose that it's healthy to preserve some manual intervention in these sections.
Well, it would be super great if we didn't have to do that and one day RPM might give us some reasonable way to generate more of the specfile based on the contents of the source files. See https://github.com/rpm-software-management/rpm/issues/104 But even if we get that, you won't be able to use it on anything but rawhide for some time.
IU> You can count on that. I'll try to identify corner cases to see how your macros behave in such packages and report back as time permits.
I'm certainly happy to develop them in tandem with your investigations. I have no expectation that what I banged out yesterday is remotely complete, but it at least shows that for at least a couple of packages you can hide rather a lot of the details behind macros.
I think the next step is for me to stuff those macros in a temporary pagure repo. We can track what's missing with tickets and then eventually get the macros packaged up so that they can be used in the buildsystem.
- J<
Since I actually had an existing pagure repo for random RPM macro experiments, I just dropped the R macro stuff there.
https://pagure.io/misc-rpm-macros https://pagure.io/misc-rpm-macros/blob/master/f/macros.R-extra
I still have some ideas to implement but feel free to test what's there. To use this, you just need to check out the repo and then add a symlink to macros.R-extra in /usr/lib/rpm/macros.d.
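The checkout-and-symlink step could be wrapped in a small shell helper along these lines. This is a sketch under stated assumptions: the function name is hypothetical, the clone URL in the comment is guessed from the repository page, and writing to the default macro directory normally requires root.

```shell
# Sketch only: link_r_macros is a hypothetical helper.

# link_r_macros CHECKOUT_DIR [MACROS_DIR]
# Symlink macros.R-extra from a checkout into the rpm macro directory.
# MACROS_DIR defaults to /usr/lib/rpm/macros.d.
link_r_macros() {
    # Resolve to an absolute path so the symlink does not depend on the cwd.
    lm_checkout=$(cd "$1" && pwd) || return 1
    lm_macros_dir=${2:-/usr/lib/rpm/macros.d}
    if [ ! -f "$lm_checkout/macros.R-extra" ]; then
        echo "macros.R-extra not found in $lm_checkout" >&2
        return 1
    fi
    ln -s "$lm_checkout/macros.R-extra" "$lm_macros_dir/macros.R-extra"
}

# Typical use (the clone URL is an assumption):
#   git clone https://pagure.io/misc-rpm-macros.git
#   link_r_macros misc-rpm-macros      # run as root
```

Removing the symlink afterwards restores the stock macro set, which keeps the experiment easy to undo.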
- J<
Hello,
On Thu, 1 Nov 2018 at 20:45, Jason L Tibbitts III tibbs@math.uh.edu wrote:
Since I actually had an existing pagure repo for random RPM macro experiments, I just dropped the R macro stuff there.
https://pagure.io/misc-rpm-macros https://pagure.io/misc-rpm-macros/blob/master/f/macros.R-extra
I still have some ideas to implement but feel free to test what's there. To use this, you just need to check out the repo and then add a symlink to macros.R-extra in /usr/lib/rpm/macros.d.
- J<
This is a rather old email thread that I'm resurrecting, but I'm wondering what the state of those macros is now?
We now have a place in upstream rpm where we could store these macros: https://github.com/rpm-software-management/R-rpm-macros It is currently used for automated dependency generators only.
Hi,
I had every intention of pushing this forward, but I never found the time. Sorry for that. It would still be great to have these.
Iñaki
On Wed, 6 Nov 2019 at 10:11, Elliott Sales de Andrade quantum.analyst@gmail.com wrote:
Hello,
On Thu, 1 Nov 2018 at 20:45, Jason L Tibbitts III tibbs@math.uh.edu wrote:
Since I actually had an existing pagure repo for random RPM macro experiments, I just dropped the R macro stuff there.
https://pagure.io/misc-rpm-macros https://pagure.io/misc-rpm-macros/blob/master/f/macros.R-extra
I still have some ideas to implement but feel free to test what's there. To use this, you just need to check out the repo and then add a symlink to macros.R-extra in /usr/lib/rpm/macros.d.
- J<
This is a rather old email thread that I'm resurrecting, but I'm wondering what the state of those macros is now?
We now have a place in upstream rpm where we could store these macros: https://github.com/rpm-software-management/R-rpm-macros It is currently used for automated dependency generators only.
-- Elliott
devel@lists.stg.fedoraproject.org