Class type Pxp_reader.resolver


class type resolver = object .. end

method init_rep_encoding : Pxp_core_types.rep_encoding -> unit
A resolver can open an input source and return it as a lexer_source (a Lexing.lexbuf before PXP 1.2).

After creating a resolver, one must invoke the two methods init_rep_encoding and init_warner to set the internal encoding of strings and the warner object, respectively. This is normally done by the parsing functions in Pxp_yacc. It is not necessary to invoke these two methods for a fresh clone.

It is possible that the character encoding of the source and the internal encoding of the parser are different. To cope with this, one of the tasks of the resolver is to recode the characters of the input source into the internal character encoding.

Note that there are several ways of determining the encoding of the input:

(1) It is possible that the transport protocol (e.g. HTTP) transmits the encoding.
(2) It is possible to inspect the beginning of the file, and to analyze:
(2.1) The first two bytes indicate whether UTF-16 is used.
(2.2) Otherwise, one can assume that an ASCII-compatible character set is used. It is now possible to read the XML declaration <?xml ... encoding="xyz" ...?>. The encoding found here is to be used.
(2.3) If the XML declaration is missing, the encoding is UTF-8.

The resolver needs only to distinguish between cases (1), (2.1), and the rest. The details of analyzing whether (2.2) or (2.3) applies are programmed elsewhere, and the resolver will be told the result (see below).
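The three-way distinction the resolver has to make can be sketched as follows. This is a minimal illustration, not PXP code: the type initial_encoding and the function classify are hypothetical names, and the sketch assumes that UTF-16 is recognized by its byte order mark (FE FF or FF FE) in the first two bytes.

```ocaml
(* Coarse classification of the input encoding, mirroring cases (1), (2.1)
   and "the rest" from the text. All names here are illustrative only and
   are not part of PXP. *)
type initial_encoding =
  | From_transport of string  (* case (1): the protocol named the encoding *)
  | Utf16_be                  (* case (2.1): byte order mark FE FF *)
  | Utf16_le                  (* case (2.1): byte order mark FF FE *)
  | Ascii_compatible          (* cases (2.2)/(2.3): settled later by the parser *)

(* [prefix] holds the first bytes of the input; [transport] is the encoding
   announced by the transport protocol, if any. *)
let classify ?transport (prefix : string) : initial_encoding =
  match transport with
  | Some enc -> From_transport enc
  | None when String.length prefix >= 2 ->
    (match prefix.[0], prefix.[1] with
     | '\xFE', '\xFF' -> Utf16_be
     | '\xFF', '\xFE' -> Utf16_le
     | _ -> Ascii_compatible)
  | None -> Ascii_compatible
```

In the Ascii_compatible case the resolver does not decide between (2.2) and (2.3) itself; it waits for the parser to report the result.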

A resolver is like a file: it must be opened before one can work with it, and it should be closed after all operations on it have been done. The method 'open_rid' is called with the resolver ID as argument and it must return the lexbuf reading from the external resource. (There is also the old method 'open_in' that expects an ext_id as argument. It is less powerful and should not be used any longer.) The method 'close_in' does not require an argument.

It is allowed to re-open a resolver after it has been closed. It is forbidden to open a resolver again while it is open. It is allowed to close a resolver several times: If 'close_in' is invoked while the resolver is already closed, nothing happens.
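The open/close rules can be modelled with a small toy class. This is only a sketch: lifecycle_resolver is a hypothetical class, not a PXP resolver, it takes a plain string instead of a resolver ID, and raising Failure on a second open is an assumption (the text only says that this is forbidden).

```ocaml
(* Toy model of the open/close life cycle; not a PXP class. *)
class lifecycle_resolver =
  object
    val mutable is_open = false
    val mutable opened_count = 0

    (* Opening is only allowed while the resolver is closed. Raising
       Failure here is an assumption made for this sketch. *)
    method open_rid (_id : string) : unit =
      if is_open then failwith "resolver is already open";
      is_open <- true;
      opened_count <- opened_count + 1

    (* Closing an already closed resolver does nothing. *)
    method close_in : unit = is_open <- false

    method is_open : bool = is_open
    method opened_count : int = opened_count
  end
```

With this model, the sequence open, close, close, open succeeds, while open followed directly by open raises.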

The method 'open_rid' may raise Not_competent to indicate that this resolver is not able to open this type of ID.
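One use of such an exception is to try several resolvers in turn until one accepts the ID. The sketch below illustrates the idea with plain functions. The exception is redeclared locally as a stand-in, and opener, file_opener, http_opener and open_with are hypothetical names, not PXP functions.

```ocaml
(* Local stand-in for the exception named in the text. *)
exception Not_competent

(* Hypothetical opener: maps an ID string to some content. *)
type opener = string -> string

let has_prefix (p : string) (s : string) : bool =
  String.length s >= String.length p
  && String.sub s 0 (String.length p) = p

(* Each opener accepts only the IDs it understands. *)
let file_opener : opener = fun id ->
  if has_prefix "file:" id then "content of " ^ id else raise Not_competent

let http_opener : opener = fun id ->
  if has_prefix "http:" id then "content of " ^ id else raise Not_competent

(* Try the openers in order; the first one that does not raise
   Not_competent wins. If none is competent, the exception propagates. *)
let rec open_with (openers : opener list) (id : string) : string =
  match openers with
  | [] -> raise Not_competent
  | o :: rest ->
    (try o id with Not_competent -> open_with rest id)
```

For example, open_with [file_opener; http_opener] "http://example.org/a.xml" skips the file opener and is answered by the HTTP opener.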

If 'open_rid' gets a PUBLIC ID, it can be assumed that the string is already normalized with respect to whitespace.

The method 'change_encoding' is called from the parser after the analysis of case (2) has been done; the argument is either the string name of the encoding, or the empty string to indicate that no XML declaration was found. It is guaranteed that 'change_encoding' is invoked after only a few tokens of the file. The resolver should react as follows:

The following rule helps to synchronize the lexbuf with the encoding: If the resolver has been opened, but 'change_encoding' has not yet been invoked, the lexbuf contains at most one character (which may be represented by multiple bytes); i.e. the lexbuf is created by Lexing.from_function, and the function puts only one character into the buffer at a time. After 'change_encoding' has been invoked, there is no longer a limit on the lexbuf size.

The reason for this rule is that the resolver then knows exactly at which character the input switches to the encoding passed by 'change_encoding'.
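The one-character rule above can be sketched with a refill function for Lexing.from_function. This is a simplified illustration, not the PXP implementation: make_refill and make_lexbuf are hypothetical names, the input is a string held in memory, no recoding is done, and every character is assumed to occupy exactly one byte.

```ocaml
(* Build a refill function for Lexing.from_function together with a
   function that lifts the one-character limit. Assumption: single-byte
   characters, so "one character" means "one byte" in this sketch. *)
let make_refill (src : string) =
  let pos = ref 0 in
  let limited = ref true in
  let refill (buf : bytes) (n : int) : int =
    let remaining = String.length src - !pos in
    (* While limited, hand out at most one character per call. *)
    let wanted = if !limited then 1 else n in
    let k = min wanted (min n remaining) in
    Bytes.blit_string src !pos buf 0 k;
    pos := !pos + k;
    k  (* 0 signals end of input to the lexer *)
  in
  let lift_limit () = limited := false in
  (refill, lift_limit)

(* The lexbuf never holds more than one unread character until
   [lift_limit] is called, e.g. from a change_encoding method. *)
let make_lexbuf (src : string) =
  let refill, lift_limit = make_refill src in
  (Lexing.from_function refill, lift_limit)
```

Because the lexer cannot read ahead while the limit is in force, nothing beyond the current character has been decoded with the provisional encoding when the switch happens.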

The method 'clone' may be invoked for open or closed resolvers. In both cases, 'clone' returns a new resolver which is always closed. If the original resolver is closed, the clone is simply a copy. If the original resolver is open at the moment of cloning, and the clone is later opened for a relative system ID (i.e. a relative URL), the clone must interpret this ID relative to the ID of the original resolver.
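The cloning rule can be illustrated with a toy class that remembers the ID of the resolver it was cloned from. Everything here is hypothetical: base_tracking_resolver and resolve_relative are not PXP names, IDs are plain slash-separated paths rather than resolver IDs, and the relative resolution is deliberately naive compared with real URL resolution.

```ocaml
(* Naive relative resolution: replace the last path segment of [base]
   with [id], unless [id] is already absolute. *)
let resolve_relative ~(base : string) (id : string) : string =
  if String.length id > 0 && id.[0] = '/' then id
  else
    match String.rindex_opt base '/' with
    | Some i -> String.sub base 0 (i + 1) ^ id
    | None -> id

(* Toy resolver; not a PXP class. *)
class base_tracking_resolver =
  object
    (* ID of the resolver this object was cloned from, if it was open. *)
    val base : string option = None
    (* ID currently opened by this object; None while closed. *)
    val mutable active : string option = None

    method open_rid (id : string) : unit =
      let absolute =
        match base with
        | Some b -> resolve_relative ~base:b id
        | None -> id
      in
      active <- Some absolute

    method close_in : unit = active <- None

    method active_id : string option = active

    (* The clone is always closed, but inherits the active ID as its base. *)
    method clone = {< base = active; active = None >}
  end
```

A clone taken while the original has "/docs/book/main.xml" open resolves "chapters/ch1.xml" to "/docs/book/chapters/ch1.xml"; a clone of a closed resolver has no base and uses the ID unchanged.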

method init_warner : Pxp_core_types.symbolic_warnings option ->
Pxp_core_types.collect_warnings -> unit
method rep_encoding : Pxp_core_types.rep_encoding
method open_in : Pxp_core_types.ext_id -> lexer_source
This is the old method to open a resolver. It is superseded by open_rid. This method may raise Not_competent if the object does not know how to handle this ext_id.

PXP 1.2: Now returns a lexer_source, no longer a lexbuf.

method open_rid : Pxp_core_types.resolver_id -> lexer_source
This is the new method to open a resolver. It takes a resolver ID instead of an ext_id but works in the same way.

PXP 1.2: Now returns a lexer_source, no longer a lexbuf.

method close_in : unit
method change_encoding : string -> unit
method clone : resolver
Every resolver can be cloned. The clone does not inherit the connection with the external object, i.e. it is initially closed.
method active_id : Pxp_core_types.resolver_id
Returns the resolver ID that is actually in use. This is the ID passed to open_rid, with unused components set to None. The resolver ID returned by active_id plays an important role when expanding relative URLs.

method close_all : unit

Closes this resolver and every clone.

This method is no longer supported in PXP 1.2.