docs: add migration guides and tutorial #999
ev-br
left a comment
This is very nice.
I've left several comments, mostly very minor. There are two themes:
- For the migration guide, I think it would be helpful to be somewhat more specific and more opinionated.
- In the tutorial, I love the power iteration example. The only concern is that you talk about JIT, but the example does not jit, not easily at least. So maybe either expand a bit on how to actually jit it, or state that not everything benefits from jitting and add one more example which does?
Co-authored-by: Athan <kgryte@gmail.com>
lucascolley
left a comment
Sorry, I missed this PR. Apologies for the post-merge review, unfortunate that this was merged just before the meeting.
I've only had time to look through the migration guide so far, see below for suggestions.
| # Migration Guide | ||
|
| This page is meant to help migrate your codebase to an Array API compliant |
here and throughout (maybe dropping 'standard' makes sense in some cases, but see #778)
| This page is meant to help migrate your codebase to an Array API compliant | |
| This page is meant to help migrate your codebase to an array API standard compliant |
| exact use-case, you should look thoroughly into at least one of them. | ||
|
| The first part is dedicated for {ref}`array-producers`. If your library | ||
| mimics, for example, NumPy's or Dask's functionality, then you can find in |
Minor nit: Dask is probably a strange choice here, since it is itself quite firmly mimicking NumPy. Maybe PyTorch would be a better example, as a library from which the standard took influence in differing from (historical) NumPy.
| implementation. The guide is divided into two parts and, depending on your | ||
| exact use-case, you should look thoroughly into at least one of them. |
nit, seems unnecessary to me
| implementation. The guide is divided into two parts and, depending on your | |
| exact use-case, you should look thoroughly into at least one of them. | |
| implementation. The guide is divided into two parts. |
| part to learn how to make it library agnostic and interchange array | ||
| namespaces with significantly less friction. |
is "interchange array namespaces" really what the second part is helping with? To me it seems focused on making functions agnostic
| # Migration Guide | ||
|
| This page is meant to help migrate your codebase to an Array API compliant | ||
| implementation. The guide is divided into two parts and, depending on your |
I think this should be changed since there are actually three parts, the first being Ecosystem
| consumers. It includes additional array manipulation and statistical functions. | ||
| It is already used by SciPy and scikit-learn. | ||
|
| The sections below mention when and how to use them. |
I would move this up above the individual library sections otherwise it looks like this is under the array-api-extra section
| We strongly advise you to embed this setup in your CI as well. This will allow | ||
| you to continuously monitor Array API coverage, and make sure new changes don't break existing | ||
| APIs. As a reference, see [NumPy's Array API Tests CI setup](https://github.com/numpy/numpy/blob/581d10f43b539a189a2d37856e5130464de9e5f6/.github/workflows/linux.yml#L296). | ||
If I was a library developer reading this, I would be wishing there was also a link to how to set this up within a Pixi workspace, like https://github.com/mdhaber/mparray/blob/0ef47e008fef92c605f73907436d4c6617419161/pixi.toml#L119-L179
| For array consumers, the main premise is to keep in mind that your **array | ||
| manipulation operations should not lock in for a particular array producing | ||
| library**. For instance, if you use NumPy for arrays, then your code could | ||
| contain: |
| For array consumers, the main premise is to keep in mind that your **array | |
| manipulation operations should not lock in for a particular array producing | |
| library**. For instance, if you use NumPy for arrays, then your code could | |
| contain: | |
| For array consumers, the main premise is that your **array | |
| manipulation operations should not be specific to one particular array producing | |
| library**. For instance, if your code is specific to NumPy, it might contain: |
| - If you are building a library where the backend is determined by input arrays, | ||
| and your function accepts array arguments, then a recommended way is to ask | ||
| your input arrays for a namespace to use: `xp = arr.__array_namespace__()`. | ||
| If the given library doesn't have it, then [`array_api_compat.array_namespace()`](https://data-apis.org/array-api-compat/helper-functions.html#array_api_compat.array_namespace) | ||
| should be used instead: |
I think we should skip the idealism and just recommend array_namespace() for now until there are real use-cases with __array_namespace__()
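For concreteness, a minimal sketch of the `array_namespace()` pattern recommended here. The function name `standardize` is hypothetical, and the `ImportError` fallback is an assumption added only so the snippet runs where array-api-compat is not installed; it is not part of the guide.

```python
import numpy as np

try:
    from array_api_compat import array_namespace
except ImportError:
    # Assumption for this sketch only: without array-api-compat, fall back to
    # the standard's own protocol, or to NumPy itself on older NumPy versions.
    def array_namespace(*arrays):
        x = arrays[0]
        if hasattr(x, "__array_namespace__"):
            return x.__array_namespace__()
        return np


def standardize(x):
    """Hypothetical consumer function: shift to zero mean, scale to unit std."""
    xp = array_namespace(x)  # the namespace is inferred from the input array
    return (x - xp.mean(x)) / xp.std(x)


result = standardize(np.asarray([1.0, 2.0, 3.0]))
```

The consumer never imports a specific array library inside `standardize`; swapping the input for an array from another compliant library should, under the same assumptions, return an array of that library.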
| def func(s1, s2, xp): | ||
| return xp.arange(s1, s2) |
this example isn't particularly compelling, since it is something that is already achievable more easily on the user-side. I think it is worth a sentence stating when this may be worth it (e.g. there may be significant computation that you want to happen native to the array library before returning, https://docs.scipy.org/doc/scipy/reference/generated/scipy.fft.fftfreq.html) versus when it probably isn't (e.g. just trivially wrapping a value with xp.asarray before return, https://docs.scipy.org/doc/scipy/reference/constants.html)
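A sketch of the "may be worth it" case, assuming a simplified stand-in for `scipy.fft.fftfreq` (the name `fftfreq_like` and its signature are hypothetical, not SciPy's implementation). Every step runs in the namespace the caller passes, so the result is computed natively instead of being wrapped at the end.

```python
import numpy as np


def fftfreq_like(n, d, xp):
    """Hypothetical stand-in: sample frequencies for a length-n transform.

    `xp` is the array namespace chosen by the caller, since there is no
    input array to infer it from.
    """
    k = xp.arange(n, dtype=xp.float64)  # indices 0 .. n-1, created in xp
    half = (n + 1) // 2                 # number of non-negative frequencies
    k = xp.where(k < half, k, k - n)    # wrap the upper half to negative
    return k / (n * d)


freqs = fftfreq_like(4, 1.0, np)
```

By contrast, a function whose only array operation is a final `xp.asarray(value)` gains little from an `xp` parameter, since the caller can do that conversion just as easily.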
@lucascolley Thanks for reviewing! I'll open another PR for it.
FTR, nowhere are we consistent with the naming convention.
indeed, hence why I'd like to get more consistent with the place where we thought about it 😅
Personally, I think the lack of capitalization (array API standard) was a mistake, as it is more confusing for a proper noun to be lowercase.
This PR adds a migration guide (versions for array consumers and producers) and one migration tutorial showing how a simple power-iteration based algorithm from GraphBLAS can be moved to an Array API compatible version.