This provides the API and some implementations of sources with optional lookahead, intended for scanning text. These sources form the initial source and stack of conversions used by the more specialized lexical sources.
Initial sources are provided for strings and files. These can be altered with the cutext_source_stack_* functions. In particular,
typedef char const* cutext_source_info_encoding_t
The type of the CUTEXT_SOURCE_INFO_ENCODING property. This trivial definition is provided to help keep type casts correct for boxing and unboxing operations.
cu_bool_t cutext_source_can_look(cutext_source_t src)
True iff src provides the cutext_source_look method.
void cutext_source_close(cutext_source_t src)
Close src, which in turn should close its subsources.
size_t cutext_source_count(cutext_source_t src)
Drain src and return the number of bytes which were left.
cu_box_t cutext_source_default_info(cutext_source_t, cutext_source_info_key_t)
Returns NULL for encoding, and 8 for tabstop.
char const* cutext_source_encoding(cutext_source_t src)
The name of the character encoding of src, or NULL if unknown.
cutext_source_t cutext_source_fdopen(char const *enc, int fd, cu_bool_t close_fd)
Return a source over the contents read from fd encoded as enc. If close_fd is true, then close fd when cutext_source_close is called on the returned source.
cutext_source_t cutext_source_fopen(char const *encoding, char const *path)
Return a source over the contents of the file at path encoded as encoding.
cutext_encoding_t cutext_source_guess_encoding(cutext_source_t src)
Try to guess which Unicode encoding is used in src. This requires that src supports lookahead.
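The text above does not say how the guess is made. One common technique, assumed here purely for illustration, is to look for a byte-order mark in the first few lookahead bytes. The enum and the function demo_guess_by_bom below are hypothetical and not part of cutext; the real cutext_encoding_t values may differ.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical result type; not the real cutext_encoding_t. */
typedef enum {
    DEMO_ENC_UNKNOWN,
    DEMO_ENC_UTF8,
    DEMO_ENC_UTF16LE,
    DEMO_ENC_UTF16BE,
    DEMO_ENC_UTF32LE,
    DEMO_ENC_UTF32BE
} demo_encoding_t;

/* Classify a lookahead window of n bytes by its byte-order mark, if any. */
demo_encoding_t demo_guess_by_bom(const unsigned char *p, size_t n)
{
    assert(p != NULL || n == 0);
    /* UTF-32LE must be tested before UTF-16LE: FF FE 00 00 begins with FF FE. */
    if (n >= 4 && p[0] == 0xFF && p[1] == 0xFE && p[2] == 0x00 && p[3] == 0x00)
        return DEMO_ENC_UTF32LE;
    if (n >= 4 && p[0] == 0x00 && p[1] == 0x00 && p[2] == 0xFE && p[3] == 0xFF)
        return DEMO_ENC_UTF32BE;
    if (n >= 3 && p[0] == 0xEF && p[1] == 0xBB && p[2] == 0xBF)
        return DEMO_ENC_UTF8;
    if (n >= 2 && p[0] == 0xFF && p[1] == 0xFE)
        return DEMO_ENC_UTF16LE;
    if (n >= 2 && p[0] == 0xFE && p[1] == 0xFF)
        return DEMO_ENC_UTF16BE;
    return DEMO_ENC_UNKNOWN;
}
```

Because the check only inspects bytes without consuming them, it fits the stated requirement that src supports lookahead. Text without a byte-order mark is reported as unknown in this sketch; a fuller heuristic could also examine zero-byte patterns.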
cu_box_t cutext_source_info_inherit(cutext_source_t src, cutext_source_info_key_t key, cutext_source_t subsrc)
Assist the source implementation src with a subsource subsrc in providing a suitable default value for key.
void cutext_source_init(cutext_source_t src, cutext_source_descriptor_t descriptor)
Initialize the base src of a source implementation with callbacks provided by descriptor.
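The descriptor-of-callbacks pattern described here can be sketched in isolation. Everything below is a hypothetical stand-in: the fields of the real cutext_source_descriptor_t are not shown in this reference, so the demo_ types only illustrate the general shape of a base object initialized with a table of callbacks, including trivial callbacks in the spirit of cutext_source_null_read, cutext_source_noop_close and cutext_source_no_subsource.

```c
#include <assert.h>
#include <stddef.h>

typedef struct demo_source *demo_source_t;

/* Hypothetical descriptor: one callback per operation. */
typedef struct {
    size_t (*read)(demo_source_t src, void *buf, size_t max_size);
    void (*close)(demo_source_t src);
    demo_source_t (*subsource)(demo_source_t src);
} demo_descriptor_t;

/* Base part that a concrete source would embed as its first member. */
struct demo_source {
    const demo_descriptor_t *descr;
};

/* Attach the callback table to the base part. */
void demo_source_init(demo_source_t src, const demo_descriptor_t *descr)
{
    assert(src != NULL && descr != NULL);
    src->descr = descr;
}

/* Trivial callbacks for sources that have nothing to read, close or wrap. */
size_t demo_null_read(demo_source_t src, void *buf, size_t max_size)
{
    (void)src; (void)buf; (void)max_size;
    return 0;
}

void demo_noop_close(demo_source_t src) { (void)src; }

demo_source_t demo_no_subsource(demo_source_t src)
{
    (void)src;
    return NULL;
}

/* Descriptor of an always-empty source built only from the trivial callbacks. */
const demo_descriptor_t demo_empty_descr = {
    demo_null_read, demo_noop_close, demo_no_subsource
};

/* Public entry point: dispatch through the descriptor. */
size_t demo_source_read(demo_source_t src, void *buf, size_t max_size)
{
    return src->descr->read(src, buf, max_size);
}
```

A concrete implementation would supply its own read callback and reuse the trivial ones for the operations it does not need.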
void const* cutext_source_look(cutext_source_t src, size_t size, size_t *size_out)
Request a lookahead of size bytes of upcoming data from src. If successful, the data is returned and its actual size is assigned to *size_out. The actual size may be larger than size; it may be smaller only at the end of the stream or in case of error.
This method is only provided by some source implementations, as reported by cutext_source_can_look. If needed, cutext_source_stack_buffer stacks a buffer onto any source to provide lookahead.
cutext_source_t cutext_source_new_cstr(char const *cstr)
Return a source over the 0-terminated UTF-8 string cstr.
cutext_source_t cutext_source_new_mem(char const *enc, void const *data, size_t size)
Return a source over size bytes of data stored at data and considered to be encoded as enc.
cutext_source_t cutext_source_new_str(cu_str_t str)
Return a source over the UTF-8 string str.
cutext_source_t cutext_source_new_wstring(cu_wstring_t wstr)
Return a source over the wide string wstr.
cutext_source_t cutext_source_no_subsource(cutext_source_t)
Returns NULL.
void cutext_source_noop_close(cutext_source_t)
The trivial close operation.
size_t cutext_source_null_read(cutext_source_t, void *, size_t)
Always returns 0.
size_t cutext_source_read(cutext_source_t src, void *buf, size_t max_size)
Read at most max_size bytes into buf, returning the actual number read or (size_t)-1 on error. A return of 0 indicates the end of the stream; any other successful call reads at least one byte.
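The three outcomes of a read (a positive count, 0 at the end of the stream, (size_t)-1 on error) suggest a simple draining loop, which is also roughly what a counting function such as cutext_source_count has to do. The sketch below uses a hypothetical in-memory stand-in, demo_mem_source, instead of the real cutext_source_t so that it is self-contained; only the return-value contract is taken from the description above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a source: a memory region and a cursor. */
typedef struct {
    const char *data;
    size_t size;
    size_t pos;
} demo_mem_source;

/* Returns the number of bytes read, 0 at end of stream, (size_t)-1 on error. */
size_t demo_read(demo_mem_source *src, void *buf, size_t max_size)
{
    if (src == NULL || (buf == NULL && max_size > 0))
        return (size_t)-1;
    size_t left = src->size - src->pos;
    size_t n = left < max_size ? left : max_size;
    if (n > 0)
        memcpy(buf, src->data + src->pos, n);
    src->pos += n;
    return n;
}

/* Drain src and return how many bytes were left, or (size_t)-1 if a read fails. */
size_t demo_count(demo_mem_source *src)
{
    char buf[8];
    size_t total = 0;
    assert(src != NULL);
    for (;;) {
        size_t n = demo_read(src, buf, sizeof buf);
        if (n == (size_t)-1)
            return (size_t)-1;  /* propagate the error marker */
        if (n == 0)
            return total;       /* end of stream */
        total += n;
    }
}
```

Note that the error value must be tested before the count is used, since (size_t)-1 is also the largest possible size_t.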
size_t cutext_source_skip(cutext_source_t src, size_t max_size)
Skips over at most max_size bytes, returning the actual number skipped, or (size_t)-1 on error. A return of 0 indicates the end of the stream. If a previous call to cutext_source_look has returned a size of max_size or larger since the last read or skip, then this call succeeds with the full count.
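The interplay between lookahead and skipping is what makes a peek-then-consume scanner possible: inspect the upcoming bytes, and only skip them once they match. The following sketch models that contract with a hypothetical in-memory source; demo_look, demo_skip, demo_accept and the window size DEMO_WINDOW are illustrative names, not cutext functions.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum { DEMO_WINDOW = 16 };  /* hypothetical minimum lookahead window */

typedef struct {
    const char *data;
    size_t size;
    size_t pos;
} demo_look_source;

/* Expose upcoming bytes without consuming them. The reported size may exceed
 * the request, and is smaller only when the stream ends first. */
const void *demo_look(demo_look_source *src, size_t size, size_t *size_out)
{
    assert(src != NULL && size_out != NULL);
    size_t left = src->size - src->pos;
    size_t want = size > DEMO_WINDOW ? size : DEMO_WINDOW;
    *size_out = left < want ? left : want;
    return src->data + src->pos;
}

/* Consume at most max_size bytes and return the number consumed. */
size_t demo_skip(demo_look_source *src, size_t max_size)
{
    assert(src != NULL);
    size_t left = src->size - src->pos;
    size_t n = left < max_size ? left : max_size;
    src->pos += n;
    return n;
}

/* Consume kw if it comes next in the stream; return 1 on match, 0 otherwise. */
int demo_accept(demo_look_source *src, const char *kw)
{
    size_t len = strlen(kw);
    size_t avail;
    const char *p = demo_look(src, len, &avail);
    if (avail < len || memcmp(p, kw, len) != 0)
        return 0;  /* nothing consumed on mismatch */
    /* The look reported at least len bytes, so the skip covers all of them. */
    return demo_skip(src, len) == len;
}
```

On a mismatch the position is left untouched, so the caller can try another alternative against the same input.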
cutext_source_t cutext_source_stack_buffer(cutext_source_t subsrc)
Stack a buffer on top of subsrc to provide lookahead. The cutext_source_look method is guaranteed to be available on the returned source.
cutext_source_t cutext_source_stack_iconv(char const *newenc, cutext_source_t subsrc)
Stack an iconv conversion filter on subsrc, converting its contents to the encoding newenc.
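For readers unfamiliar with iconv, the conversion such a filter performs can be sketched with the POSIX iconv functions directly. This is not the cutext implementation: demo_convert is a hypothetical one-shot helper that converts a whole buffer, whereas a streaming filter would also have to carry incomplete multibyte sequences across chunk boundaries.

```c
#include <assert.h>
#include <iconv.h>
#include <stddef.h>
#include <string.h>

/* Convert in[0..in_size) from fromenc to toenc, writing into out.
 * Returns the number of bytes written, or (size_t)-1 on any failure,
 * including an output buffer that is too small. */
size_t demo_convert(const char *toenc, const char *fromenc,
                    const char *in, size_t in_size,
                    char *out, size_t out_size)
{
    assert(toenc != NULL && fromenc != NULL && in != NULL && out != NULL);
    iconv_t cd = iconv_open(toenc, fromenc);
    if (cd == (iconv_t)-1)
        return (size_t)-1;

    char *inp = (char *)in;  /* iconv takes char **; the input is not modified */
    char *outp = out;
    size_t in_left = in_size;
    size_t out_left = out_size;

    size_t rc = iconv(cd, &inp, &in_left, &outp, &out_left);
    if (rc != (size_t)-1)
        rc = iconv(cd, NULL, NULL, &outp, &out_left);  /* flush any shift state */
    iconv_close(cd);

    return rc == (size_t)-1 ? (size_t)-1 : out_size - out_left;
}
```

As an example, the four ISO-8859-1 bytes of "caf\xE9" become five bytes in UTF-8, because the accented letter needs two bytes there.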