Cocoa (momijizukamori) wrote 2021-04-03 02:22 pm
AO3 Reader for Kobo
Thinking things through out loud here, mostly, as I think at the moment adevyish is the only other person with a real stake in this project (though if you are an intrepid Kobo ereader owner who would like to easily browse and read fic on your device, you too may be interested!). There are basically two main challenges in this project - the first is that AO3 is designed for web, and only web, with the 'download this fic' button being the only way to access fic data in a different format, and the other is that Kobo firmware is basically a very minimal Linux OS, with no prebuilt web engine binary.
For issue two, the two solutions are 'write something to parse HTML and render it to an image' or 'use an HTML renderer someone else has written'. If I absolutely HAD to, I could do the first bit in Python - I've used the image-rendering packages in Python before, and I've even dealt with low-level interfaces for eink frame buffers. But, frankly, that's a LOT of work. Like we take reflowing text for granted, and it's only when you have to specify every linebreak yourself that you realize the number of calculations that go into it. Which leads us to 'use someone else's renderer', and for that, there are really three options: KOReader, which is written in C and Lua; Plato, which is written in Rust; and the WebKit engine that shipped with Qt4, which is C++. I actually tried the last option first, because I have a bit of experience with Qt from undergrad, but it turns out cross-compiling it is a huuuuuge pain in the ass and despite many hours attempting it I still haven't been able to successfully cross-compile even sample code. Which leaves KOReader and Plato, and as I was going to be about equally lost in the code (as these are both single programs that happen to be OSS, rather than a general-purpose framework like Qt, and thus have basically no code documentation), I opted to go with Plato because I think going forward in my life, Rust will be handier, and Plato's GUI is nicer/more polished.
So - we've got Plato for handling the HTML -> framebuffer rendering, and for all the GUI components, which leaves me to actually write the solution to problem one, fetching data from AO3 and rendering it into a format that is ereader-friendly, which basically involves fetching pages from AO3, parsing and scraping them for particular bits of content, and then feeding that cleaned content into Plato's HTML rendering engine. And then providing GUI elements that translate on the backend to specific requests to AO3, like posting a comment, leaving kudos, bookmarking a fic, etc. I really wish there were an API, because scraping and trying to mimic web traffic this way is a lot more brittle, but lol that's never gonna happen.
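A minimal sketch of the 'scrape and clean' step described above, written in Python with only the standard library for brevity (the real thing would live in Rust alongside Plato). It pulls the inner HTML out of every element carrying a given class and discards the rest of the page. The class name `userstuff` and the names `ContentExtractor` and `extract` are assumptions for illustration, not a description of AO3's markup guarantees or of Plato's API. It also assumes well-nested markup, so an unclosed `<p>` would throw the depth counter off.

```python
from html import escape
from html.parser import HTMLParser

# Tags that never take a closing tag, so they must not change nesting depth.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img",
             "input", "link", "meta", "source", "track", "wbr"}


class ContentExtractor(HTMLParser):
    """Collect the inner HTML of every element that has the target class."""

    def __init__(self, target_class):
        super().__init__(convert_charrefs=True)
        self.target_class = target_class
        self.depth = 0    # 0 means we are outside any matched element
        self.parts = []   # fragments of the cleaned output

    def handle_starttag(self, tag, attrs):
        if self.depth:
            # Inside a match: keep the tag exactly as it appeared in the page.
            self.parts.append(self.get_starttag_text())
            if tag not in VOID_TAGS:
                self.depth += 1
            return
        classes = (dict(attrs).get("class") or "").split()
        if self.target_class in classes and tag not in VOID_TAGS:
            self.depth = 1  # start collecting, but skip the wrapper itself

    def handle_startendtag(self, tag, attrs):
        # Self-closing tags such as <br/> never change the depth.
        if self.depth:
            self.parts.append(self.get_starttag_text())

    def handle_endtag(self, tag):
        if not self.depth or tag in VOID_TAGS:
            return
        self.depth -= 1
        if self.depth:  # depth 0 means the wrapper just closed
            self.parts.append(f"</{tag}>")

    def handle_data(self, data):
        if self.depth:
            # Entities were decoded by the parser, so re-escape the text.
            self.parts.append(escape(data, quote=False))


def extract(html_text, target_class="userstuff"):
    """Return the cleaned inner HTML for target_class from an in-memory page."""
    parser = ContentExtractor(target_class)
    parser.feed(html_text)
    parser.close()
    return "".join(parser.parts)
```

The page never touches disk here: `extract` takes a string and returns a string, which matches the 'data only ever exists in memory' goal.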
I went back and forth a bit in my head about whether I should be writing this as like, an app within Plato, or as a more complete fork, and I'm leaning towards fork, because it won't be possible for stuff from this to exist in the same views as local documents (as I explicitly do not want to save these requests as local documents - the data will only ever exist in memory, like it does in a browser), so I'm going to have to write my own 'Home' interface anyway. I'm kind of thinking of this in terms of views that need to be written, and what needs to be done for each one.
Work view
- The minimum viable product, to use business-speak terms, and the bit I've actually started on
- Still struggling to decide if I want to fetch the whole work at once, or individual chapters. Drawbacks to whole-work: will have to modify the document tree to add relative anchor links to chapter titles, no easy access to per-chapter comments. Drawbacks to single-chapter: have to rewrite Plato's chapter functionality to handle remote locations rather than relative ones, have to figure out a nice way to move to the previous/next chapter in the reader when they're not part of the same document.
- I'm trying to figure out where to put access to AO3-specific functionality - the work metadata, kudos, comments, etc etc. I'd like to include it as buttons on the overlay controls, but there is already quite a lot crammed in there. I definitely don't want to show the work metadata on every chapter the way the web interface does, because on a limited screen size that would get obnoxious so very fast to page past.
- Maybe display controls should be moved into the dropdown? As you are substantially less likely to change them per-work than you are with other docs where the base formatting may vary.
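For the single-chapter option, the previous/next problem can be sketched as a small index that maps chapter order onto remote locations, so the reader asks the index where to go instead of seeking within one document. This is only an illustration in Python: `Chapter`, `ChapterIndex`, and the example paths are hypothetical, and none of it reflects how Plato's chapter handling is actually structured.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Chapter:
    title: str
    url: str  # remote location to fetch, rather than an offset in a local file


class ChapterIndex:
    """Track the reader's position across chapters that are separate documents."""

    def __init__(self, chapters: List[Chapter]):
        self.chapters = list(chapters)
        self.position = 0

    def current(self) -> Chapter:
        return self.chapters[self.position]

    def next(self) -> Optional[Chapter]:
        """Move forward one chapter, or return None at the end of the work."""
        if self.position + 1 >= len(self.chapters):
            return None
        self.position += 1
        return self.current()

    def previous(self) -> Optional[Chapter]:
        """Move back one chapter, or return None at the start of the work."""
        if self.position == 0:
            return None
        self.position -= 1
        return self.current()


# Hypothetical paths, for illustration only.
index = ChapterIndex([
    Chapter("One", "/works/1/chapters/10"),
    Chapter("Two", "/works/1/chapters/11"),
    Chapter("Three", "/works/1/chapters/12"),
])
```

Paging past the last screen of a chapter would then call `index.next()` and fetch the returned `url`, and a `None` result means there is nothing further to load.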
List view
- Basically any view of a list of works - a tag view, search results, an author's works, etc.
- How the fuck am I going to deal with wall-of-tags/variable lengths of summaries in such a limited space, I don't know
- I don't think I can fully proxy the advanced search page, because the auto-complete relies on XHR requests and I think the lag for making the requests and redrawing the UI is going to be too big. The sort-and-filter sidebar view is doable though because that data is populated on page load.
- Actions and metadata will vary based on the view, and go either at the top or bottom of the screens - space is somewhat less of an issue here because there'll be fewer display controls.
- Have to figure out how to handle pagination smoothly, because the AO3 results page size and the Kobo screen view page size will not be remotely the same, and will need to be tracked separately in the backend.
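The pagination mismatch comes down to arithmetic: given a screen page, work out which remote results pages it overlaps and which slice of each is needed. A sketch in Python, where `remote_slices` is a hypothetical helper and the page sizes in the example (6 works per screen, 20 per remote page) are assumptions, not measured values:

```python
def remote_slices(screen_page, per_screen, per_remote):
    """Map a 0-based screen page onto the remote pages that cover it.

    Returns a list of (remote_page, start, stop) tuples, where remote_page
    is 1-based and start/stop are offsets within that remote page
    (stop is exclusive).
    """
    first = screen_page * per_screen   # index of the first work on this screen
    last = first + per_screen          # exclusive index of the last work
    slices = []
    item = first
    while item < last:
        remote_page = item // per_remote + 1
        start = item % per_remote
        # Take as much as is needed, but never past the end of the remote page.
        stop = min(per_remote, start + (last - item))
        slices.append((remote_page, start, stop))
        item += stop - start
    return slices
```

With those example sizes, screen page 3 covers works 18 to 23, so `remote_slices(3, 6, 20)` returns `[(1, 18, 20), (2, 0, 4)]`: the tail of the first remote page plus the head of the second. The backend would still need to clamp against the real total on the final remote page.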
Home view
- The starting point for the app. Should handle login/logout functionality, and have paths to account view stuff, and browse.
- Going to replicate the favorite tag view, but ideally I'd like to add logic to be able to favorite ANY list view. Particularly filtered search views, so I don't have to keep resetting the filters to remove results in languages I cannot read. And AO3 refuses to add this functionality themselves, so.
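Favoriting ANY list view is mostly a matter of saving the path plus its filter parameters and rebuilding the request later. A sketch in Python using only the standard library. The function names, the example path, and the filter keys (`language`, `min_kudos`) are hypothetical placeholders, not AO3's real query parameters.

```python
import json
from urllib.parse import urlencode


def add_favorite(favorites, name, path, filters):
    """Record a list view under a user-chosen name."""
    favorites[name] = {"path": path, "filters": dict(filters)}


def favorite_url(favorite):
    """Rebuild the request path for a saved list view."""
    # Sort the filters so the same favorite always yields the same URL.
    query = urlencode(sorted(favorite["filters"].items()))
    return f"{favorite['path']}?{query}" if query else favorite["path"]


favorites = {}
add_favorite(favorites, "english only", "/tags/Example/works",
             {"language": "en", "min_kudos": 100})

# Favorites are plain dicts, so they survive a round trip through JSON.
restored = json.loads(json.dumps(favorites))
```

Only the favorites themselves would be persisted this way. The fetched results still stay in memory.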
There's probably also some assorted smaller views (like fandom browse), but I figure I'll get to those when I need them. When will this all get done? Who knows! But the idea won't leave me alone and reading fic on my phone in bed is not great for my eyes, my shoulders, or my sleep cycle, so I have incentive. And getting to the point of 'load HTML from memory, not from disk' was a major breakthrough so I have motivation to work on this.
no subject
So I own an older generation one, so the newer models may have more features (though I think they build the same firmware for all their devices?) - but with mine, the built-in reader app can open locally saved HTML and render it just fine, but there's no way to load data from a network stream. There's a slightly hidden 'experimental' browser, which I'm fairly certain is just a wrapper around the Qt4 WebKit engine, which has no session restore support and does nothing to try to make the experience eink-friendly. And let me tell you, scrolling on an eink display is a rough experience - the maximum frame rate tops out at around three frames a second, which is with partial refresh and thus a fair bit of ghosting.
no subject
I bought mine last year and it’s the exact same issues. Been getting around it by memorizing the author name while on my phone, typing the author page URL into the experimental browser, then finding the fic I want and hitting download ^^;;
no subject
Not surprising - the Nook I had before my Kobo didn't even have that much, iirc
no subject
The eink technology has gotten better - they don't need the full-negative screen-refresh every page flip that the early devices had, and the refresh rates have gotten better, but it'd still need some significant innovations to overcome the current physical limitations - and unfortunately the modern web is a place where stuff is moving all the time, and you need pretty solid memory specs. And I'm pretty sure Chromium is basically not supported on single-core processors at this point - and more powerful processors and memory eat into the battery life that's one of eink's strengths.
tl;dr - I am also frustrated by the limitations (see: this entire post) but I also understand why they're there.
(oh, and if you're still researching, you might want to post here and get ideas from the massive ereader nerds on that forum)
no subject
The default Kobo list view is also terrible for telling you anything about a story, so I’m pretty used to tapping into the first page of a fic to figure out if I want to read it today or not. So I’d be perfectly ok with only the first relationship tag (or character tag if that doesn’t exist) and major warning tags, because that’d still be an improvement on Kobo lol. Not sure how much I’d want of the summary.
The tag auto-complete is such a struggle even on a proper browser, I s2g the XHR request that comes back is always one or two characters away from what I was typing. Since it seems to check what you currently type against what comes back, it feels like I never get auto-complete back (unless I delete a character or two to get a request that’s already returned).
Particularly filtered search views
Ah yes, my entire folder of bookmarks that’s just 'kudos > 100 -[specific trope]' for every ship I check.
no subject
Hmmmm this actually gives me an idea, I will try to mock it up later.
They don't do any event debounce on input, which is a valid choice for auto-complete I guess, but means if there's any extra delay the requests are just gonna get backed up in the system and lead to lag in the refresh, unfortunately.
My solution is to just leave my search tabs open forever, but god, it wouldn't be that hard to give us the ability to save the search to our account, why won't you let us have nice things AO3.
(I put in a support request specifically for this feature at one point and got told 'no' so it's not like they don't know it's a thing people want)
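For reference, the event debounce mentioned above can be sketched like this in Python: each keystroke replaces the pending query, and a request is only released once the input has been quiet for a set delay. `Debouncer` and the 300 ms delay are hypothetical and say nothing about how AO3's own auto-complete is written. Times are passed in explicitly (in milliseconds) so the behaviour is easy to follow.

```python
class Debouncer:
    """Release only the latest value, once input has been idle for `delay` ms."""

    def __init__(self, delay):
        self.delay = delay
        self.pending = None  # (deadline, value) or None

    def push(self, value, now):
        # A new keystroke replaces whatever was waiting and resets the timer.
        self.pending = (now + self.delay, value)

    def poll(self, now):
        """Return the value to send if the quiet period has elapsed, else None."""
        if self.pending is not None and now >= self.pending[0]:
            value = self.pending[1]
            self.pending = None
            return value
        return None


debouncer = Debouncer(delay=300)
debouncer.push("a", now=0)
debouncer.push("ab", now=100)
debouncer.push("abc", now=200)
```

Three keystrokes produce a single request for "abc" once `poll` is called at or after 500 ms, rather than three requests queueing up behind each other.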
no subject
I download long stuff, or stuff I know I want to re-read, but it's too much effort for when I fling myself through four pages of search results over the course of a couple hours.
no subject
Theoretically you could probably cross-compile it for any architecture Rust targets, as long as the device uses the same display type as Kobos (I haven't looked at the code closely enough to see if it implements the display drivers itself or uses bindings from another package). In practice, as far as I know Plato has only been compiled for Kobo devices, and most of the other mainstream ones make it hard-to-impossible to run arbitrary binaries. (It's not super-easy on Kobo devices either, but it's relatively straightforward and doesn't involve having to seriously modify the device and/or circumvent security protections.)