Support parsing streams #2534

Merged
kddnewton merged 1 commit into main from stream-parsing
Mar 7, 2024

@kddnewton
Collaborator

@kddnewton kddnewton commented Mar 1, 2024

Closes #1247 (our second-oldest open issue!)

Prism can now parse streams. In the C API you can use it like this:

#include <stdio.h>
#include "prism.h"

int main(void) {
    pm_parser_t parser;
    pm_options_t options = { .line = 1 };

    // Read from stdin one line at a time, using fgets as the line reader.
    pm_node_t *node = pm_parse_stream(&parser, stdin, (pm_parse_stream_fgets_t *) fgets, &options);

    // The buffer must be zero-initialized before anything is appended to it.
    pm_buffer_t buffer = { 0 };
    pm_prettyprint(&buffer, &parser, node);
    printf("%.*s\n", (int) pm_buffer_length(&buffer), pm_buffer_value(&buffer));

    pm_node_destroy(&parser, node);
    pm_parser_free(&parser);
    pm_buffer_free(&buffer);
    return 0;
}

Relevant signatures are:

typedef char * (pm_parse_stream_fgets_t)(char *restrict string, int size, void *restrict stream);

PRISM_EXPORTED_FUNCTION pm_node_t *
pm_parse_stream(pm_parser_t *parser, void *stream, pm_parse_stream_fgets_t *fgets, const pm_options_t *options);

Prism never touches the stream itself, so it is treated as an opaque pointer. The given function pointer is responsible for loading one line at a time into the buffer passed as its first parameter. From the Ruby API you can call it like:

Prism.parse_stream(stream)

where stream is any object that responds to gets(length).

This is suitable for implementing streaming from stdin into the prism parser, while appropriately stopping the stream at the point that it hits an __END__ marker, as in:

stream = StringIO.new("foo\n__END__\nbar")
Prism.parse_stream(stream)
stream.read # => "bar"

cc @enebo @eregon @seven1m

@kddnewton kddnewton force-pushed the stream-parsing branch 2 times, most recently from 89228a7 to 7e4766f Compare March 1, 2024 19:33
@enebo
Collaborator

enebo commented Mar 1, 2024

@kddnewton I need to see if jffi and project Panama are capable of passing a pointer to a function which returns what you want. Others will want this I am sure but I am not sure if I can use it at this point or not.

@kddnewton
Collaborator Author

@enebo any chance you can pass stdin and fgets directly? (That was kind of my hope in making the function pointer have the same signature as fgets.)

@kddnewton
Collaborator Author

I could also provide a version of this that assumes stdin and fgets?

@enebo
Collaborator

enebo commented Mar 1, 2024

@kddnewton It looks like I can make a function in JNR. Sometimes this is hit or miss, but I will try to figure this out next week. So long as I can make the right signature, this should work out ok. stdin+fgets might be possible too, but I still need to bind it using JNR (JNR is in Java and not C).

@enebo
Collaborator

enebo commented Mar 1, 2024

Err I misread your second suggestion but let's assume I can make a proper pointer to function and see where that goes.

@eregon
Member

eregon commented Mar 2, 2024

> I could also provide a version of this that assumes stdin and fgets?

That would be great and would be quite a bit easier to call.

In fact, is there a use case for this with anything other than stdin?
If not, it would be quite a bit simpler, notably in the ffi backend, if there were no need for callbacks.
Also, callbacks are very slow with the FFI gem/libffi/JNI, so they are best avoided performance-wise (>100ns overhead per upcall IIRC).
In fact it's so bad that I would prefer such an API with a custom Ruby IO not be exposed, as it can't be implemented efficiently with FFI (it's likely also not that great in the C extension, with one rb_funcall(io, "gets") call per line when there are many lines).

@kddnewton kddnewton force-pushed the stream-parsing branch 2 times, most recently from 61feb6b to b61ab29 Compare March 7, 2024 20:24
@kddnewton kddnewton merged commit 26d47de into main Mar 7, 2024
@kddnewton kddnewton deleted the stream-parsing branch March 7, 2024 20:40
Development

Successfully merging this pull request may close these issues.

Serialization of piped Ruby source code

3 participants