Discussion:
[gs-bugs] [Bug 698776] - jbig2dec - jbig2dec fails to parse "immediate generic region" segment with unspecified length
b***@artifex.com
2017-11-24 09:18:20 UTC
http://bugs.ghostscript.com/show_bug.cgi?id=698776

Bug ID: 698776
Summary: jbig2dec fails to parse "immediate generic region"
segment with unspecified length
Product: jbig2dec
Version: 0.14
Hardware: All
OS: All
Status: UNCONFIRMED
Severity: normal
Priority: P4
Component: Parsing
Assignee: ***@artifex.com
Reporter: ***@pdflib.com
QA Contact: gs-***@ghostscript.com
Word Size: ---

Created attachment 14488
--> http://bugs.ghostscript.com/attachment.cgi?id=14488&action=edit
JBIG2 image with "immediate generic region"

How to reproduce:

Invoke jbig2dec 0.14 like this:

$ ./jbig2dec -v 9 -o bug5200.png bug5200_p1_I1.jbig2
jbig2dec info file header indicates a single page document
jbig2dec DEBUG file header indicates sequential organization
jbig2dec DEBUG segment 0 is associated with page 1 (segment 0x00)
jbig2dec info Segment 0, flags=30, type=48, data_length=19 (segment 0x00)
jbig2dec info page 1 image is 196x1159 (7874 ppm) (segment 0x00)
jbig2dec DEBUG allocated 196x1159 page image (28975 bytes) (segment 0x00)
jbig2dec DEBUG segment 1 is associated with page 1 (segment 0x01)

Expected result:

jbig2dec creates PNG file from JBIG2 input file.

Actual result:

No PNG output file is created.

The special property of this file is that it contains an "Immediate generic
region" where the segment header has a value of 0xFFFFFFFF in the "segment data
length" field.

The JBIG2 specification "ISO/IEC 14492 : 2001 (E)" says this in "7.2.7 Segment
data length":

"If the segment's type is "Immediate generic region", then the length field may
contain the value 0xFFFFFFFF. This value is intended to mean that the length of
the segment's data part is unknown at the time that the segment header is
written (for example in a streaming application such as facsimile). ..."

As far as I understand jbig2dec's source code, there is no special handling for
this case. In jbig2_data_in() the length of the segment is read, and then, when
ctx->state is JBIG2_FILE_SEQUENTIAL_BODY, the following check determines that
there is not enough data available:

    case JBIG2_FILE_SEQUENTIAL_BODY:
    case JBIG2_FILE_RANDOM_BODIES:
        segment = ctx->segments[ctx->segment_index];
        if (segment->data_length > ctx->buf_wr_ix - ctx->buf_rd_ix)
            return 0; /* need more data */

I assume that code is needed here to determine the length of the segment by
parsing the "Immediate generic region" segment data.
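
To illustrate what such length detection could look like, here is a minimal sketch, not jbig2dec code. It assumes, based on my reading of section 7.2.7 of the specification, that the data part ends with the two bytes 0x00 0x00 (MMR coding) or 0xFF 0xAC (arithmetic coding), followed by a four-byte row count. The function name find_generic_region_end() and the constants are hypothetical, and extended templates are ignored.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Region segment information field: width, height, x, y (4 bytes each)
 * plus one flags byte. The generic region flags byte follows it. */
#define REGION_INFO_SIZE 17u

/* Hypothetical helper, not part of jbig2dec.
 * buf points at the start of the segment data part, avail is the number
 * of bytes currently buffered. Returns the length of the data part
 * (including the 2-byte end marker and the 4-byte row count), or 0 if
 * the end is not in the buffer yet. */
static size_t
find_generic_region_end(const uint8_t *buf, size_t avail)
{
    assert(buf != NULL);
    if (avail < REGION_INFO_SIZE + 1)
        return 0; /* generic region flags not available yet */

    uint8_t flags = buf[REGION_INFO_SIZE];
    int mmr = flags & 0x01;
    int gbtemplate = (flags >> 1) & 0x03;

    /* Assumption: without MMR, adaptive template bytes follow the flags
     * (8 bytes for template 0, otherwise 2). Skip them so that they
     * cannot be mistaken for the end marker. */
    size_t start = REGION_INFO_SIZE + 1;
    if (!mmr)
        start += (gbtemplate == 0) ? 8 : 2;

    uint8_t m0 = mmr ? 0x00 : 0xFF;
    uint8_t m1 = mmr ? 0x00 : 0xAC;

    /* Only accept a marker if the 4-byte row count after it is buffered. */
    for (size_t i = start; i + 6 <= avail; i++) {
        if (buf[i] == m0 && buf[i + 1] == m1)
            return i + 6;
    }
    return 0; /* need more data */
}
```

A caller in the position of the check quoted above could use a non-zero result as the segment's data length and keep returning "need more data" while the result is 0.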
--
You are receiving this mail because:
You are the QA Contact for the bug.
b***@artifex.com
2017-11-24 11:25:13 UTC
http://bugs.ghostscript.com/show_bug.cgi?id=698776

--- Comment #1 from Stephan Mühlstrasser <***@pdflib.com> ---
Note that a PDF file that wraps this JBIG2 stream is rendered correctly by
Ghostscript, which was confusing at first. After looking at the Ghostscript
code, the reason became clear:

http://git.ghostscript.com/?p=ghostpdl.git;a=blob;f=base/sjbig2.c;h=20052bf80e67c9e1dd2fe499fbf266ddbfe73a70;hb=HEAD#l197

/* process a section of the input and return any decoded data.
   see strimpl.h for return codes.
 */
static int
s_jbig2decode_process(stream_state * ss, stream_cursor_read * pr,
                      stream_cursor_write * pw, bool last)
...
    if (in_size > 0) {
        /* pass all available input to the decoder */
        jbig2_data_in(state->decode_ctx, pr->ptr + 1, in_size);
        pr->ptr += in_size;
        /* simulate end-of-page segment */
        if (last == 1) {
            jbig2_complete_page(state->decode_ctx);
        }
        /* handle fatal decoding errors reported through our callback */
        if (state->callback_data->error) return state->callback_data->error;
    }

Here the whole JBIG2 image is in one chunk in memory, whereas the jbig2dec
command line tool reads it piecewise from a file. In the context of
Ghostscript, jbig2_data_in() also trips over the length check, which does not
take into account that 0xFFFFFFFF can occur legally, and it returns 0, which
is not checked.

Then the explicit call to "jbig2_complete_page(state->decode_ctx);" happens,
and jbig2_complete_page() contains this:

http://git.ghostscript.com/?p=jbig2dec.git;a=blob;f=jbig2_page.c;h=4af1adc5f781c146056cd2d90d2c424b304dd13b;hb=HEAD#l190

    /* check for unfinished segments */
    if (ctx->segment_index != ctx->n_segments) {
        Jbig2Segment *segment = ctx->segments[ctx->segment_index];

        /* Some versions of Xerox Workcentre generate PDF files
           with the segment data length field of the last segment
           set to -1. Try to cope with this here. */
        if ((segment->data_length & 0xffffffff) == 0xffffffff) {
            jbig2_error(ctx, JBIG2_SEVERITY_WARNING, segment->number,
                        "File has an invalid segment data length!"
                        " Trying to decode using the available data.");
            segment->data_length = ctx->buf_wr_ix - ctx->buf_rd_ix;
            code = jbig2_parse_segment(ctx, segment, ctx->buf + ctx->buf_rd_ix);
            ctx->buf_rd_ix += segment->data_length;
            ctx->segment_index++;
        }
    }

This will fix the self-inflicted wound by now parsing the "Immediate generic
region" segment data.

Note that the warning "File has an invalid segment data length!" is
unjustified, as the segment data length 0xffffffff is allowed for "Immediate
generic region" segments.
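
As a rough sketch of how the warning could be limited to cases where the length really is invalid, the check below tests the segment type as well as the length. The names are hypothetical and not part of jbig2dec. It assumes that the low six bits of the segment header flags hold the segment type (consistent with "flags=30, type=48" in the log above) and that type 38 is "Immediate generic region".

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical names, not part of jbig2dec. */
#define SEGMENT_TYPE_MASK                     0x3Fu
#define SEGMENT_TYPE_IMMEDIATE_GENERIC_REGION 38u
#define SEGMENT_LENGTH_UNKNOWN                0xFFFFFFFFu

static_assert(SEGMENT_TYPE_IMMEDIATE_GENERIC_REGION <= SEGMENT_TYPE_MASK,
              "segment type must fit into the type bits");

/* Returns true if a data length of 0xFFFFFFFF is permitted for a segment
 * with the given header flags, i.e. the segment is an "Immediate generic
 * region". Other segment types are not covered by the quoted paragraph
 * of 7.2.7, so they are treated as not permitted here. */
static bool
unknown_length_is_legal(uint8_t flags, uint32_t data_length)
{
    return data_length == SEGMENT_LENGTH_UNKNOWN &&
           (flags & SEGMENT_TYPE_MASK) == SEGMENT_TYPE_IMMEDIATE_GENERIC_REGION;
}
```

With a check like this, the warning would only be issued when the length is 0xffffffff and unknown_length_is_legal() returns false.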
b***@artifex.com
2017-11-26 15:32:06 UTC
http://bugs.ghostscript.com/show_bug.cgi?id=698776

Henry Stiles <***@artifex.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Assignee|***@artifex.com             |***@hotmail.co.uk