Module Pxp_ev_parser


module Pxp_ev_parser: sig .. end

val create_entity_manager : ?is_document:bool ->
Pxp_types.config -> Pxp_dtd.source -> Pxp_entity_manager.entity_manager
val process_entity : Pxp_types.config ->
Pxp_types.entry ->
Pxp_entity_manager.entity_manager -> (Pxp_types.event -> unit) -> unit
Parses a document or a document fragment. At least the well-formedness of the document is checked, but the flags of the entry argument may specify more.

While parsing, events are generated and the passed function is called for every event. The parsed text is read from the current entity of the entity manager. The current entity may be either open or closed.

The entry point to the parsing rules can be specified. Notes to entry points:

The entry points have options, see Pxp_types for explanations.

The generated events are not normalized with respect to:

Only the following config options have an effect:

If an error occurs, the callback function is invoked exactly once with the E_error event. The error is additionally passed to the caller, because the exception is allowed to propagate out of process_entity. It is not possible to resume parsing after an error.

The idea behind this special error handling is that the callback function should always be notified when the parser stops, no matter whether it is successful or not. So the last event passed to the callback function is either E_end_of_stream or E_error. You can imagine that process_entity follows this scheme:

try
  "parse";
  eh E_end_of_stream           (* eh is the callback function *)
with
  error ->
    "cleanup";
    let pos = ... in
    let e = At(pos, error) in
    eh (E_error e);
    raise e

Note that there is always an At(_,_) exception that wraps the exception that originally occurred. This style of exception handling applies to exceptions generated by the parser as well as to exceptions raised by the callback function.
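The scheme above can be tried out without PXP installed. The following is a minimal sketch under stated assumptions: the event type and the At exception are local stand-ins that only mimic the shape of the PXP definitions, and run, failing_parse and the position string are hypothetical names introduced for illustration. It shows that a failing parse reports E_error to the callback exactly once and then re-raises the wrapped exception.

```ocaml
(* Stand-in types: these only mimic the shape of the PXP event type and
   the At exception; they are NOT the real PXP definitions. *)
type event =
  | E_char_data of string
  | E_end_of_stream
  | E_error of exn

exception At of string * exn

(* Hypothetical driver following the scheme: [parse] emits events through
   [eh]; any failure is wrapped in At, reported to [eh] once as E_error,
   and then re-raised to the caller. *)
let run (parse : (event -> unit) -> unit) (eh : event -> unit) : unit =
  try
    parse eh;
    eh E_end_of_stream
  with error ->
    let pos = "at <placeholder position>:" in
    let e = At (pos, error) in
    eh (E_error e);
    raise e

(* Events seen by the callback, most recent first. *)
let log : event list ref = ref []
let eh ev = log := ev :: !log

(* A "parser" that emits one event and then fails. *)
let failing_parse emit =
  emit (E_char_data "abc");
  failwith "syntax error"

(* What the caller observes: the wrapped exception falls through. *)
let outcome =
  match run failing_parse eh with
  | () -> "finished"
  | exception At (_, Failure msg) -> "At wrapping Failure: " ^ msg
  | exception _ -> "unexpected exception"
```

After this runs, the head of !log is the E_error event and outcome records that the caller also received the At exception, so both notification paths described above are exercised.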

val process_expr : ?first_token:Pxp_lexer_types.token ->
?following_token:Pxp_lexer_types.token Pervasives.ref ->
Pxp_types.config ->
Pxp_entity_manager.entity_manager -> (Pxp_types.event -> unit) -> unit
This is a special parsing function that corresponds to the entry Entry_expr, i.e. it parses a single element, processing instruction, or comment. In contrast to process_entity, the current entity is not opened; the entity is expected to be open already. Likewise, the entity is not closed after parsing (unless an error occurs).

~first_token: This token is prepended to the tokens read from the entity manager.

~following_token: The token following the last parsed token is optionally stored into this variable. Note: By design, the parser _always_ reads the following token. I know that this may lead to serious problems when trying to integrate this parser with another parser. It is currently hard to change!

val create_pull_parser : Pxp_types.config ->
Pxp_types.entry ->
Pxp_entity_manager.entity_manager -> unit -> Pxp_types.event option
let next_event = create_pull_parser cfg entry mng in
let ev = next_event()

This function parses the XML document in "pull mode". next_event should be invoked repeatedly until it returns None, indicating the end of the document. The events are encoded as Some ev.

The function returns exactly the same events as process_entity.

In contrast to process_entity, no exception is raised when an error occurs. Only the E_error event is generated (as the last event).

To create a stream of events, just do:

let next = create_pull_parser cfg entry mng in
let stream = Stream.from(fun _ -> next())
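The pull-mode loop can be sketched without PXP as follows. This is an illustration under assumptions, not the library's own code: the event type is a local stand-in, pull_of_list is a hypothetical replacement for the function returned by create_pull_parser, and collect_text is a helper invented here. Because pull mode reports errors as an event rather than by raising, the loop returns a result value. The last line uses Seq.of_dispenser from the standard library (OCaml 4.14 or later), which may serve where the Stream module is unavailable, since Stream was removed from the standard library in OCaml 5.0.

```ocaml
(* Stand-in event type; it only mimics the shape of the PXP event type. *)
type event =
  | E_char_data of string
  | E_end_of_stream
  | E_error of exn

(* Hypothetical pull function built from a fixed list, standing in for
   the result of create_pull_parser cfg entry mng. *)
let pull_of_list (evs : event list) : unit -> event option =
  let rest = ref evs in
  fun () ->
    match !rest with
    | [] -> None
    | ev :: tl -> rest := tl; Some ev

(* Call [next] repeatedly until it returns None, concatenating character
   data. An E_error event ends the loop with Error. *)
let collect_text (next : unit -> event option) : (string, exn) result =
  let buf = Buffer.create 64 in
  let rec loop () =
    match next () with
    | None -> Ok (Buffer.contents buf)
    | Some (E_error e) -> Error e
    | Some E_end_of_stream -> loop ()
    | Some (E_char_data s) -> Buffer.add_string buf s; loop ()
  in
  loop ()

let ok =
  collect_text
    (pull_of_list [ E_char_data "ab"; E_char_data "c"; E_end_of_stream ])

let bad =
  collect_text (pull_of_list [ E_char_data "ab"; E_error (Failure "bad") ])

(* The same kind of pull function viewed as a Seq.t (OCaml >= 4.14). *)
let as_list =
  List.of_seq (Seq.of_dispenser (pull_of_list [ E_char_data "x"; E_end_of_stream ]))
```

A sequence built with Seq.of_dispenser is ephemeral: it should be traversed only once, which matches the one-shot nature of a pull parser.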


Filters have been moved to Pxp_event!

For conversions from trees to event streams, and vice versa, see Pxp_document.