Sudden errors by different order of toc/spine by EPUBCheck v4.2.0-rc #1036

Closed
shiestyle opened this issue Apr 6, 2019 · 32 comments · Fixed by #1056
Labels
priority: critical To be processed and published ASAP spec: EPUB 3.x Impacting the support of EPUB 3.x specifications status: has PR The issue is being processed in a pull request type: spec The issue is related to a Specification update

Comments

@shiestyle

We found that EPUBCheck v4.2.0-rc reports errors when the order of the toc nav differs from the spine order.
We have since learned that a mismatched order was already prohibited in EPUB 3.0, but EPUBCheck v4.2.0-rc may be the first version to check it properly.
(#888)

Unfortunately, we may already have generated quite a few EPUB files whose toc order differs from the spine order, because the printed books have such a toc order.

For example, our Manga content is structured like below.

[spine order]

  • Chapter/Story 01
  • Column 01
  • Chapter/Story 02
  • Column 02
    ...

[toc order]

  • Chapter/Story 01
  • Chapter/Story 02
    ...
  • Column 01
  • Column 02
    ...
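
To make the mismatch concrete, here is a minimal Python sketch of an order check applied to the example above. The file names and the `out_of_order_entries` helper are hypothetical and only mirror the ordering in the two lists; this is not EPUBCheck's code.

```python
# Hypothetical file names; only the ordering mirrors the example above.
spine = ["story01.xhtml", "column01.xhtml", "story02.xhtml", "column02.xhtml"]
toc = ["story01.xhtml", "story02.xhtml", "column01.xhtml", "column02.xhtml"]


def out_of_order_entries(spine, toc):
    """Return toc entries that point earlier in the spine than a preceding toc entry."""
    position = {doc: i for i, doc in enumerate(spine)}
    flagged = []
    highest = -1  # highest spine position reached so far in toc order
    for doc in toc:
        if position[doc] < highest:
            flagged.append(doc)
        else:
            highest = position[doc]
    return flagged


print(out_of_order_entries(spine, toc))  # ['column01.xhtml']
```

Under this reading of the rule, a single toc entry that jumps backwards in the spine is enough to trigger the report, even though every link target is valid.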

Such a toc order may be chosen because chapters and columns differ in value, and the toc then makes it easy for users to reach the prioritized chapters.
I'm not sure, but other publishers in Japan may also have Manga content whose toc order differs from the reading order.

Although we will fix our EPUB files whenever we find such an out-of-order toc, I'd like to discuss reconsidering this part of the specification so that printed books can be reproduced faithfully. Alternatively, I'd like to propose changing EPUBCheck's behavior to a WARNING, or another message level, so that existing EPUB files remain usable.

As already reported, mainly from Japan, EPUBCheck errors have a critical impact because they prevent EPUB files from being delivered to eBook stores in Japan.

I would appreciate it if you have any interest in this topic.

@Doktorchen

Looking into my books, this rule can cause trouble for multilingual books (sections in different languages in each document chapter, but for each language a sublist in the toc) or for non-linear books (a document can be the next document for multiple documents in the toc).

The way out might be not to put everything in this specific nav element with ops:type 'toc', if it does not fit into the order.
The other option would be to accept that digital books can be quite different from printed books: they do not always have exactly one spine or exactly one ordered navigation concept, and this is one of the big advantages of digital books compared to printed books. However, EPUB does not seem to reflect such advantages very much. It seems to prefer reducing digital books to a simplified or second-choice type of book (most internet bookstores seem to follow the same strategy, forcing authors to publish only minimalistic, simplified EPUBs compared to printed editions), and it is not obvious why.

Often, to provide such advantages to the audience, authors have to work around these restrictions.
Maybe, this is the option for those types of mangas/comics as well?
Another option could be to extend EPUB in version 4 for such digital books beyond the possibilities of printed books?

@rdeltour
Member

rdeltour commented Apr 6, 2019

Thank you for the feedback @ShinyaTakami. I think this needs to be escalated to the CG, @dauwhe @RachelComerford @mattgarrish.

@rdeltour rdeltour self-assigned this Apr 6, 2019
@rdeltour rdeltour added spec: EPUB 3.x Impacting the support of EPUB 3.x specifications status: in discussion The issue is being discussed by the development team type: spec The issue is related to a Specification update labels Apr 6, 2019
@mattgarrish
Member

mattgarrish commented Apr 6, 2019

It's a useful indicator of possible authoring error - that you shifted your content around but forgot to update your toc to match it - but that doesn't warrant a must/error.

There's also a case that it's confusing to the reader if we loosen the restriction, especially for persons with cognitive disabilities, but that's something better addressed by the accessibility specification/techniques.

I wonder, though, if it was an attempt to make the toc more compatible with translation to an ncx. Again, not sure that warrants a must, but would better explain the rationale for having the requirement.

@nekennedy

It also flags out-of-order page-lists, which I'm seeing a lot of and think is a good idea.

Are there any use cases for out-of-order page-lists?

@Doktorchen

Use cases are in general books, that do not really have (only) one reading order.
Such books have another structure than the simple spine model. This is more or less the model of an ordered list (XHTML:ol), an encyclopedia for example typically is more the model of no list at all, respectively in a printed book an unordered list (XHTML:ul).

In non-linear books the audience may have, at the end of each document/chapter, a choice between alternatives for how to continue. Parts of the book may have some order, but at some points there is an unordered list of alternatives to choose from.
They have an additional list (toc) of the current alternatives at the end of each chapter, so why not combine all of this into the main toc to provide alternative access to the content?
Some chapters may appear multiple times in such a combined list.

Books with more than one language may have such a selection as well.
Comics or other books whose major content is graphics (SVG inside XHTML) may have the texts for the graphics in the same document, but alternatively in different languages.
Such a book may fit the spine model, but in the toc one may want to provide one ordered list per language, with each list item pointing to the fragment of a document that matches the language.

Without graphics: If one provides the text in two languages, the audience can have different reasons or motivations. A part may want to read only the version in the first language, another in the second language, a third part may want to compare for educational purposes.
Therefore it is useful to provide three (or four) sublists in the toc, one for each fraction.
With the CSS pseudo-class :target one can even adjust the styling of the document, depending on fragment target listed in the sublist.

If one puts everything strictly in one order, the result can be a lot of duplicated content.
For some books (especially the non-linear ones), one will reduce the toc simply to frontmatter, backmatter and the beginning of the text and maybe an additional document with an extended toc for details and everything (without EPUB restrictions as for the ops:toc), that does not fit into an ordered list at all.
This might be the general workaround or solution, to avoid the error message for books with more complexity and options for the audience to select between alternatives (for accessibility or by concept of the book).

If EPUB wants to cover those use-cases, the spine could contain different structure, not just an ordered list, an unordered list as well and the choice between alternatives, the same applies for a toc.

@rdeltour
Member

The EPUB 3 CG decided to keep this an ERROR during the April 11, 2019 con call.

I'll consequently close this issue, but feel free to keep on discussing it here. If you disagree with the CG decision, please open an issue on the CG repository.

@rdeltour rdeltour added status: wontfix The issue is rejected due to limitations (of scope or dev resources) and removed status: in discussion The issue is being discussed by the development team type: spec The issue is related to a Specification update labels Apr 22, 2019
@shiestyle
Author

Let me reopen this issue because the impact will be critical in Japanese eBook market more than we expected.

@dauwhe
Contributor

dauwhe commented Jun 12, 2019

Note this is not a change made in epub 3.2. The spec is clear in epub 3.0.1:

The order of li elements contained within the toc nav element must match the order of the targeted elements within each targeted EPUB Content Document, and must also follow the order of Content Documents in the Rendition's spine.

@shiestyle
Author

We understand the current spec in EPUB 3.x, and that EPUBCheck 4.2.x will follow it, but in Japan EPUB files that violate this rule have already been generated when converting paper books to eBooks.

We will collect such examples from the Japanese market, across several publishers.

@rdeltour
Member

@ShinyaTakami I hear your concerns. However, as noted above, reporting this issue as a mere WARNING or even disabling it would be inconsistent with the spec. I'm not saying it's totally impossible, but the EPUBCheck team cannot make this decision; it needs to be approved by the community (EPUB CG, possibly requiring backing from the Publishing BG). The last time we discussed this (April 11, 2019 call), the CG decided to keep it an ERROR.

@mattgarrish
Member

I still agree with @ShinyaTakami that given evidence Japanese publishers have taken different approaches, it's too late to put this cat back in the bag. It's been almost ten years of not enforcing the rule, so there should be a compelling reason to change course now.

It feels arbitrary that we're going to let HTML in iframes fly under the radar, for example, but be draconian about this.

@dauwhe
Contributor

dauwhe commented Jun 12, 2019

I know it sounds crazy, but what about having a path in epubcheck saying that if lang=jp then we skip this check?

@clapierre

either that or have a command line option not to include that check.

@laudrain
Collaborator

@ShinyaTakami will existing, already distributed EPUBs ever go through EPUBCheck 4.2?

But since EPUBCheck 4.2 will be used on new titles entering distribution, production specs can be revised now to comply with the EPUB 3 spec.

Then do we have an issue?

@mattgarrish
Member

The nav document requirements were a translation of the NCX rules into HTML, but this particular rule has no corollary in EPUB 3 that I know of since the navigation document isn't defined for playback -- media overlays took on that functionality.

Before breathing life into this unused requirement, we should have a solid case for why it is even needed. It seems like a more useful check for Ace to enforce.

@laudrain
Collaborator

@mattgarrish do you mean it is only an accessibility requirement ?

@mattgarrish
Member

do you mean it is only an accessibility requirement ?

As far as I can tell, ordering doesn't serve any vital function within the specification itself, unlike the other requirements for being able to process and extract the links to display. Like I wrote above, it looks like a bit of an over-zealous translation of NCX rules, where playback depended on the order of elements.

Accessibility is one important reason for having ordering, regardless of the technical arcana of how the rule came about, but we don't enforce those kinds of best practices through normative statements in the core specs. I think this could be better checked by Ace for those who are concerned about ordering.

@rdeltour
Member

I don't think this is heavily impacting accessibility. It is not totally unrelated to WCAG 2.4.3 (focus order) or 3.2.3 (consistent navigation), but neither of those can be explicitly interpreted as requiring the ToC to be consistent with the document order.

But anyways, we're facing the old question of whether EPUBCheck should strictly follow the spec, or be lenient about widespread invalid content. As @mattgarrish said, we did loosen the checks for iframes and can very well do the same for this requirement too, if the community decides it's the right approach.

We should however be wary about asking EPUBCheck to willfully ignore the spec; if the logic is pushed to the extreme it would mean that we can never fix old wrongs or implement previously-unchecked spec rules. Ideally, these issues should be really fixed at the spec level. It's too bad we missed the opportunity to prune this requirement from EPUB 3.2 😞, if it is really unused and not necessary!

@mattgarrish
Member

I don't think this is heavily impacting accessibility.

I believe it helps with cognitively being able to follow a document, part of the multiple ways success criterion. If you look at the technique for including a table of contents, one of the procedures for determining compliance is:

Check that the values and order of the entries in the table of contents correspond to the names and order of the sections of the document.

But to your point...

if the logic is pushed to the extreme it would mean that we can never fix old wrongs or implement previously-unchecked spec rules

Yes, this drives me crazy when we deviate!!

But in this case the rule itself is flawed and hasn't been enforced. In the face of evidence that people weren't aware of it and are constructing their content in different ways, I just don't see the need to rush and implement an error. Leave the status quo alone and let's take this up the next time.

@Doktorchen

As long as this is mentioned in the specification, it is ok to be checked.
For authors of new books it is no problem to work around it, if required or useful for non-linear content with no preferred order.

However, still surprising, that this requirement applies to non-linear content at all, if mentioned within the navigation.
This means effectively, that authors have to provide an additional navigation for relevant non-linear content, they want to mention in a navigation.
I already started to update first books with this problem with additional navigation files beyond a basic navigation with this specific rule.

Accessibility concerns for linear content are understandable, however for non-linear content authors should know better than scripts, which arrangement provides a good access, respectively whether it might be useful to provide additional alternative arrangements for people with different approaches to understand a text or different capabilities.

@shiestyle
Author

Thanks for your active discussions, and sorry for my late reply; I was on an overseas business trip last week.

Now let me summarize this issue from my point of view.

[What problem in Japan?]

In Japan, EPUBCheck is used both by publishers when generating EPUB files and by eBook stores (including Apple) when accepting them. So it is a problem for us if existing EPUB files are flagged with an ERROR when we have to re-deliver them to eBook stores (because of a changed phone number in the colophon, for example) or when we have to deliver all of our EPUB files to a new eBook store.

So, in answer to @laudrain's question, I have to say that this is not a problem for new EPUB files.

Most EPUB files in Japan are created manually and have to be modified manually. It is therefore not easy to determine how many existing EPUB files violate the TOC rule, which is why Japanese publishers are strongly concerned about this EPUBCheck error.

KADOKAWA has decided to fix its incorrect EPUB files as they are found, but I have heard that some major publishers in Japan may not accept the new behavior of EPUBCheck 4.2 and will ask eBook stores not to use EPUBCheck 4.2 or later.

That's the reason why I asked to re-open this issue.

[Specification in EPUB 3.x]

We are sorry, but for a long time we were not aware that a 'different order of toc/spine' is not allowed in EPUB 3.0.

As some people have already commented, I also think the current rule can and should be changed to expand the capabilities of the EPUB format, because paper books can provide any kind of TOC but EPUB cannot due to this rule.

EPUB 3.2 is already finalized, and we Japanese publishers hope to discuss this issue as a topic for EPUB 3.2.1 or later.

[Behavior in EPUBCheck 4.2.x]

I understand that EPUBCheck 4.2.x follows the EPUB specification as a matter of policy, and that its behavior is correct because a 'different order of toc/spine' is not allowed even by the latest EPUB 3.2.

As an engineer, I also know that adding irregular or locale-specific behavior is not an appropriate approach.

However, would it be possible for EPUBCheck to adopt a feature of a future EPUB version in advance, if the PBG and EPUB3-CG agree to change the TOC specification in the future?

[My suggestion]

I think the plan below may be one reasonable solution for this issue.

  1. Japanese publishers propose changing the TOC spec in a future version of EPUB 3.
  2. W3C PBG and EPUB3-CG discuss this issue and agree on the change in EPUB 3.2.1.
  3. EPUBCheck 4.2.2 integrates the TOC spec change in advance.

@laudrain
Collaborator

@ShinyaTakami thanks for your detailed answer.

Your proposed plan has to be considered by EPUB3-CG and PBG.

Meanwhile, I'm wondering if reporting this issue as a mere WARNING would help in Japan. @rdeltour mentioned this as a possibility in his comment above.

@shiestyle
Author

Yes, outputting as WARNING by EPUBCheck will be another solution for Japanese market.

@rdeltour
Member

Yes, outputting as WARNING by EPUBCheck will be another solution for Japanese market.

If this is indeed reasonable for the Japanese market, I personally believe this could be the best solution:

  • it doesn't deviate too much from the spec (and the CG could possibly amend the spec in a later version to remove the MUST, or even add a note already to clarify the situation?)
  • it leaves the check in place for people who find it useful

Let's propose this to the CG and PBG.

@shiestyle
Author

I have gathered samples and use cases of EPUBs in Japan that violate the TOC rule.

We found many samples in magazines and specialized books.

[Low-priority content such as columns]

Many of the reported cases concern columns (small pieces of content or additional content in the book).

Such low-priority content is sometimes placed at the end of the NavDoc so that readers can easily reach the main chapters, skipping the columns.

A similar case was found in a travel guide book, where small maps on the page were treated the same way as columns.

On the other hand, we found a case of bonus content that is listed right after the TOC in the NavDoc (the actual content is located at the end) in order to advertise that such special content is bundled.

[Another index in cooking books]

In cooking books, we found cases where a first TOC, sorted by genre (meat, fish, soup, etc.), follows the order of the chapters, and a second TOC, sorted by ingredient (tomato, egg, milk, etc.), is added for a better recipe search experience.

I think this multiple index feature should be discussed in the future to enhance the capability of eBooks.

[Manga magazines]

Manga magazines are popular in Japan. Their TOC page tends to be located in the last chapter, after the Manga chapters, and we sometimes move it to the top level when generating eBooks to make the TOC easy to find.

[Novel magazines]

See the attached TOC image (modified for content protection).
[Image: toc_of_magazine_sample]

In this case, chapters are grouped by type, and the content starting on page 022 is located in the columns section.

This is a sample from a paper book, and we have to generate an out-of-order NavDoc in the EPUB to reproduce the paper book.

@dauwhe
Contributor

dauwhe commented Jun 24, 2019

I've opened an issue in the EPUB repo to discuss the spec question: w3c/epub-specs#1283

@dauwhe
Contributor

dauwhe commented Jul 2, 2019

Cross-posting from EPUB 3 CG repo... there are two separate issues here, one on nav order vs document order within a single content document, and one on nav order vs spine order. Would both of these need to be changed to WARNING to meet the needs of Japanese EPUBs? Or only the latter?

@shiestyle
Author

Will EPUBCheck evaluate the former one?

If yes, we should change both to WARNING.

@rdeltour
Member

rdeltour commented Jul 2, 2019

Will EPUBCheck evaluate the former one?

EPUBCheck currently checks both using the same logic. It's not impossible to change that, but that's how it's currently implemented.
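
To illustrate how a single comparison can cover both rules, here is a minimal Python sketch. It is not EPUBCheck's actual implementation: the `nav_order_violations` function, its data layout, and the file and fragment names are all assumptions made for the example. Each toc link is mapped to a pair (spine position, position of the fragment inside its document), and the pairs are required to be non-decreasing in toc order.

```python
def nav_order_violations(spine, element_ids, toc_links):
    """Return toc links that break spine order or in-document order.

    spine:       list of document names in reading order
    element_ids: dict mapping a document name to its ids in document order
    toc_links:   list of (document, fragment_or_None) pairs in toc order
    """
    spine_index = {doc: i for i, doc in enumerate(spine)}

    def key(link):
        doc, fragment = link
        # A link without a fragment targets the start of the document.
        in_doc = -1 if fragment is None else element_ids[doc].index(fragment)
        return (spine_index[doc], in_doc)

    violations = []
    previous = None
    for link in toc_links:
        current = key(link)
        if previous is not None and current < previous:
            violations.append(link)
        else:
            previous = current
    return violations


# Hypothetical package: two documents, a few fragment ids.
spine = ["ch1.xhtml", "ch2.xhtml"]
element_ids = {"ch1.xhtml": ["intro", "part-a", "part-b"], "ch2.xhtml": ["start"]}
toc_links = [
    ("ch1.xhtml", None),
    ("ch1.xhtml", "part-b"),
    ("ch1.xhtml", "part-a"),  # earlier in the same document than part-b
    ("ch2.xhtml", "start"),
    ("ch1.xhtml", "intro"),   # earlier in the spine than ch2.xhtml
]
print(nav_order_violations(spine, element_ids, toc_links))
```

Because the tuples compare spine position first and in-document position second, separating the two rules would mean reporting on each component on its own.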

@laudrain
Collaborator

laudrain commented Jul 2, 2019

@ALL following this thread and confirmation from @ShinyaTakami, I urge the EPUB3 CG to validate an immediate EPUBCheck change to WARNING instead of ERROR for those checks.

As @mattgarrish said

I still agree with @ShinyaTakami that given evidence Japanese publishers have taken different approaches, it's too late to put this cat back in the bag. It's been almost ten years of not enforcing the rule, so there should be a compelling reason to change course now.

Then, in issue 1283, which Dave started in the EPUB3 CG, we can discuss the spec evolution taking all aspects into account, particularly accessibility issues.

@rdeltour rdeltour added priority: critical To be processed and published ASAP status: in discussion The issue is being discussed by the development team type: spec The issue is related to a Specification update and removed status: wontfix The issue is rejected due to limitations (of scope or dev resources) labels Jul 11, 2019
@rdeltour rdeltour reopened this Jul 11, 2019
@rdeltour
Member

rdeltour commented Jul 11, 2019

Resolution: the EPUB CG decided to make this check (NAV_011) a WARNING (on the 2019-07-11 call, minutes to come).
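
The practical effect of this resolution can be sketched in Python as follows. This is only an illustration: the `Severity` enum and the `is_deliverable` helper are hypothetical, and the sketch assumes an ingestion pipeline that rejects a file only when at least one ERROR is reported, as described earlier in this thread for stores in Japan.

```python
from enum import IntEnum


class Severity(IntEnum):
    INFO = 1
    WARNING = 2
    ERROR = 3


def is_deliverable(reported_ids, severities):
    """Assumed store policy: reject only if some reported message is an ERROR."""
    return all(severities[message_id] < Severity.ERROR for message_id in reported_ids)


# The same out-of-order toc is reported as NAV_011 in both cases;
# only the severity attached to the message ID differs.
before = {"NAV_011": Severity.ERROR}
after = {"NAV_011": Severity.WARNING}
reported = ["NAV_011"]

print(is_deliverable(reported, before))  # False
print(is_deliverable(reported, after))   # True
```

The check itself stays in place, so people who find the ordering report useful still see it.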

@rdeltour rdeltour added status: ready for implem The issue is ready to be implemented and removed status: in discussion The issue is being discussed by the development team labels Jul 11, 2019
rdeltour added a commit that referenced this issue Jul 11, 2019
As agreed on the 2019-07-11 EPUB CG call:
https://www.w3.org/2019/07/11-epub3cg-minutes.html

Changes the behavior introduced in #999

Fixes #1036
@rdeltour rdeltour added status: has PR The issue is being processed in a pull request and removed status: ready for implem The issue is ready to be implemented labels Jul 11, 2019
@dauwhe
Contributor

dauwhe commented Jul 11, 2019

rdeltour added a commit that referenced this issue Jul 17, 2019
* feat: revert the spine/toc nav order check to a WARNING

As agreed on the 2019-07-11 EPUB CG call:
https://www.w3.org/2019/07/11-epub3cg-minutes.html

- changes the behavior introduced in #999
- adds an INFO message about NAV_011 being subject to change

Fixes #1036