[Test post] Wan Wan Train Doko e Iku? - Episode 10 discussion

Rexer · edited 7 months ago

wjs018 · 7 months ago

Just saw this post in the local feed. I guess now that somebody else is planning on using my tool, I need to make sure that I have proper upgrade procedures in place as I make improvements. I will do my best!

If you run into any issues, feel free to reach out or make a github issue or pull request.

WTree · 7 months ago

I am testing this tool for the AUM community.

I don’t have any experience with hosting bots so this is my first time. I am OK with Linux but not with Python, so I am having a little trouble with Python errors.

Thank you for this tool.

wjs018 · 7 months ago

It was my first time hosting a bot as well, so I get it. Feel free to hit me up on matrix, or I am on the discord server linked in the sidebar of !lightnovels@ani.social as well, if anything comes up. I am thinking now that I didn’t really design things with multiple languages in mind, so some issues could come up there that I could try to address (like text written in English that isn’t user-customizable without digging into the source). Let me know!

WTree · 7 months ago

I hope you don’t mind continuing here, so that other mods of this sub can see and discuss what they think.

For the language setting, I looked around the Lemmy API. You can easily post in another language with just language_id. I made a few modifications to src/lemmy.py, adding language_id at lines 67 and 73 under submit_text_post, and tested it out.

This is the post. The problem is that Lemmy thinks the new post is the same as this post, so the new post is not showing up in the community. Weirdly, it shows up as cross-posted to the same community in this post.

Also, can you guide me on how to get the language_id of a certain language? For this test, I ran curl "https://ani.social/api/v3/resolve_object?q=https://ani.social/post/4225708" against a post written in the language I wanted.

wjs018 · edited 7 months ago

The cross posting issue is due to the fact that any two lemmy posts that point to the same url are automatically collapsed into a crosspost. This is the main reason that I have rikka set to not have any links pointing to the images. I thought it would be a nice feature, but the crosspost thing prevented it.

To disable having an image link when the bot makes posts, you want to set submit_image = none in the [options] section of your config file.
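For anyone curious how a setting like this might be read, here is a minimal sketch using Python's standard configparser. Only the [options] section and the submit_image key come from this thread. The function name, the fallback value, and the case-insensitive comparison are assumptions for illustration, not how rikka necessarily parses its config.

```python
import configparser

# Sample config text. Only the [options] section and the submit_image key
# are taken from the thread; this is not a complete rikka config file.
SAMPLE_CONFIG = """
[options]
submit_image = none
"""


def image_links_enabled(config_text):
    """Return False when submit_image is 'none' (assumed case-insensitive)."""
    parser = configparser.ConfigParser()
    parser.read_string(config_text)
    # Assumption: a missing key is treated the same as 'none'.
    value = parser.get("options", "submit_image", fallback="none")
    return value.strip().lower() != "none"
```

With the sample above, image_links_enabled(SAMPLE_CONFIG) evaluates to False, which is the state you want in order to avoid the crosspost collapse.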

Now that 0.19.4 is live on this instance, I wonder if the thumbnail links are handled differently. I will have to do some testing around that.

As for language id, here is the page where pythorhead has all the language ids listed out.

edit: as of now, pythorhead doesn’t support setting a thumbnail url separately, so that won’t be possible yet.
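As an alternative to a static list, the language ids can probably also be read from the instance itself. To my understanding, the Lemmy v3 API returns an all_languages list (entries with id, code and name) from GET /api/v3/site, and the create-post request accepts a language_id field. The sketch below works on an already-decoded response dict so it needs no network access. The helper names and the sample ids are illustrative assumptions, so check them against your own instance.

```python
# Illustrative stand-in for the decoded JSON of GET /api/v3/site.
# The ids shown here are sample values, not a guaranteed mapping.
SAMPLE_SITE_RESPONSE = {
    "all_languages": [
        {"id": 0, "code": "und", "name": "Undetermined"},
        {"id": 37, "code": "en", "name": "English"},
    ]
}


def find_language_id(site_response, code):
    """Return the id of the language with the given code, or None."""
    for language in site_response.get("all_languages", []):
        if language.get("code") == code:
            return language["id"]
    return None


def build_post_payload(title, community_id, body, language_id):
    """Build a request body for creating a text post in a given language."""
    return {
        "name": title,
        "community_id": community_id,
        "body": body,
        "language_id": language_id,
    }
```

Leaving out the url field in the payload also keeps the post from being collapsed into a crosspost, as described above.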

WTree · 7 months ago

As for language id, here is the page where pythorhead has all the language ids listed out.

Thanks. This helps a lot.

wjs018 · 7 months ago

Just wanted to follow up and let you know that I just added the language_id setting to the config file. It works for posts as well as comments that the bot creates or edits.

WTree · 7 months ago

I just added the language_id setting to the config file

Thank you for language implementation. Just tested and it works.

cross posting issue is due to the fact that any two lemmy posts that point to the same url are automatically collapsed into a crosspost

That is what I thought.

edit: as of now, pythorhead doesn’t support setting a thumbnail url separately

For now, I have disabled posting cover photos.

WTree · edited 7 months ago

I am having trouble with the bot finding newly aired episodes in the database.

I have populated the database following the GitHub guide, and I have enabled discovery_enabled and show_discovery in the config file.

python src/rikka.py -m setup
python src/rikka.py -m edit_season spring 2024
python src/rikka.py -m update all
python src/rikka.py -m episode

When I run python src/rikka.py -m episode, it reports that 0 episodes have aired.

DEBUG | Current length of deque is 5
DEBUG | Starting new HTTPS connection (1): graphql.anilist.co:443
DEBUG | https://graphql.anilist.co:443 "POST / HTTP/11" 200 None
INFO | Found 71 upcoming episodes and discovered 0 new shows
INFO | Checking for episodes that have aired.
DEBUG | Querying database for upcoming episodes that have aired.
DEBUG | Found 0 episodes that have aired in the database.
INFO | Found 0 episodes that have aired.
INFO |

I can manually create a post by providing anilist_id and episode number like python src/rikka.py -m episode 136804 10 for current season Demon Slayer Episode 10.

wjs018 · edited 7 months ago

You probably just need to wait for some episodes to air. All that output looks fine to me. Some more details…

The first time you run the episode module, it discovers any new shows and fetches all the episodes airing within the configured number of days ahead (set in the [options] section).

So, what this means is that if you run the episode module the first time, no posts will get created since it is only looking forward in time and not backwards. From that point forward, if you run the episode module again after the api-provided airing time (plus the configured delay parameter in the [post] section) for an episode, that is when a post will get created.
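The timing rule in the paragraph above can be sketched roughly like this. The function name is hypothetical, and expressing the delay as a timedelta is an assumption, since the thread doesn't say what unit the delay parameter uses.

```python
from datetime import datetime, timedelta, timezone


def is_due(airing_at, delay, now):
    """True once the airing time plus the configured delay has passed."""
    return now >= airing_at + delay


# Example values only: an episode airing at 12:00 UTC with a 30 minute delay.
airing_at = datetime(2024, 6, 20, 12, 0, tzinfo=timezone.utc)
delay = timedelta(minutes=30)

too_early = is_due(airing_at, delay, datetime(2024, 6, 20, 12, 10, tzinfo=timezone.utc))
late_enough = is_due(airing_at, delay, datetime(2024, 6, 20, 12, 45, tzinfo=timezone.utc))
```

Here too_early is False and late_enough is True, which matches the behavior described: running the module before the delay has passed creates nothing for that episode.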

For a reference on upcoming episodes, you can check out the airing page on anichart, which is fed using the same api.

The reason that the episode module is only forward looking in time is that I found that the api gets way less reliable once you start looking at airing times in the past.

A general flow of how the episode module operates in an ongoing manner:

1. When the episode module is run, all upcoming episodes are fetched for the configured look-ahead time.
2. From this, show discovery happens by checking whether the shows the episodes belong to are already in the database (Shows table) or meet the configured discovery criteria for being added to the database.
3. The previously fetched list of upcoming episodes is filtered, and any that belong to shows in the database are added to the database (UpcomingEpisodes table).
4. Now, with a database that is fully updated with airing times, the module fetches any episodes from the UpcomingEpisodes table whose airing timestamp is earlier than the current time. These are the episodes that have “aired.”
5. These aired episodes are then checked to see whether the show is enabled or disabled in the database. If the show is enabled, the previous episode for the show, if one exists, is also checked against the engagement criteria.
   - If the aired episode belongs to an enabled show and the previous episode meets the engagement criteria, then the bot will create a post (and edit previous posts as well as add it to the Episodes table in the database).
   - If the previous episode fails to meet engagement criteria, then no post is made and the show is disabled.
6. If no post is created for the show (for whatever reason), then the episode is moved to the IgnoredEpisodes table in the database. This is the table that the bot pulls from for its user-requested threads via pm (the listen module).
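The decision part of the flow above could be sketched like this. It is a simplified model that uses plain Python containers in place of the database tables, and every name in it is hypothetical rather than taken from the rikka source. It also assumes that a show with no previous episode counts as meeting the engagement criteria.

```python
from dataclasses import dataclass


@dataclass
class AiredEpisode:
    show_id: int
    number: int


def process_aired(episode, shows, previous_meets_criteria, episodes, ignored):
    """Decide what happens to one aired episode.

    shows maps show_id -> enabled flag (stand-in for the Shows table).
    episodes and ignored are lists standing in for the Episodes and
    IgnoredEpisodes tables.
    """
    enabled = shows.get(episode.show_id, False)
    if enabled and previous_meets_criteria:
        # Stand-in for creating the post and recording the episode.
        episodes.append(episode)
        return "posted"
    if enabled:
        # The previous episode failed the engagement criteria: disable the show.
        shows[episode.show_id] = False
    # No post was made, so the episode stays available for user requests.
    ignored.append(episode)
    return "ignored"
```

For example, with shows = {136804: True}, calling process_aired(AiredEpisode(136804, 10), shows, True, [], []) returns "posted", while passing False for previous_meets_criteria returns "ignored" and flips the show to disabled.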

I think that about covers the episode module. That was done basically from memory as I can’t really dig into the code from this machine, so if something I said doesn’t add up or if a variable name is slightly different, that would be why.

WTree · 7 months ago

You probably just need to wait for some episodes to air.

I waited for new episodes and yes it is working now.

A general flow of how the episode module operates in an ongoing manner

Thanks for explaining the module in detail. If I understand it correctly:

the module checks all upcoming episodes
those episodes are filtered according to the Shows table and added to the UpcomingEpisodes table
episodes from the UpcomingEpisodes table that have an earlier timestamp than the current time are marked as ‘aired’
if the criteria are met, the bot creates posts for those ‘aired’ episodes

wjs018 · 7 months ago

That’s pretty much it! There are tons of shows going on at any given time, so I tried to add features to stem the flood of discussion posts. Some tips for managing things:

Under the [options] section, I have discovery_enabled = false. I realize now that this was named poorly: it doesn’t turn off discovery altogether, it just makes sure that any new shows it does discover are disabled by default, meaning that the bot won’t make new posts for them as they air.

You can use the disable module to mark shows as disabled in the database. I have instructions in the readme for how to disable different sets of shows (including just disabling everything).

Speaking of disabling everything…that is how I start the new season of shows. I then make a thread about the upcoming season, asking people what they are planning to watch, and go through and enable each of those shows manually using the enable module. I think that overall this worked pretty well last season. Spoiler alert is that Monday’s general thread in the anime community will be the Summer version of that process.

When a user requests a show via PM (using the listen module), it also sets the show as enabled. So, if you want to start discussion threads that way, by requesting aired shows one at a time via pm, that is another option.

Finally, some additional tips on managing the backend of things:

I go through the scheduling process in the automation section of the readme. The listen and summary modules are run every 5 minutes, while the episode module is run every 15 minutes.
On my system, I also added a third script that runs once a week at a time that doesn’t conflict with any of the other repeating scripts (I think mine is like 3:03 AM on a Sunday). This script runs the update module with the all parameter and runs the disable module with the finished parameter, just to do some housekeeping
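A rough sketch of what such a weekly housekeeping script might look like in Python. The update all invocation matches the command form shown earlier in this thread. The disable finished form is inferred from the description above and should be checked against the readme. The function names are hypothetical, and the scheduling itself (cron or similar) is left out.

```python
import subprocess
import sys


def housekeeping_commands(script="src/rikka.py"):
    """Build the two weekly housekeeping commands as argument lists."""
    return [
        [sys.executable, script, "-m", "update", "all"],
        # Assumed invocation for the disable module with the finished parameter.
        [sys.executable, script, "-m", "disable", "finished"],
    ]


def run_housekeeping(commands, runner=subprocess.run):
    """Run each command in order, stopping at the first failure."""
    for command in commands:
        runner(command, check=True)
```

Calling run_housekeeping(housekeeping_commands()) from the project directory would run both steps one after the other. The runner parameter only exists so the function can be exercised without launching anything.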

Some other things that pop into my head:

Since the database used is sqlite, it is simply a file on the filesystem. I back it up using rsnapshot. This is the page that I used to help me configure it. I haven’t had to use the backups yet, but it does give a little peace of mind.
I also periodically copy the database file onto a different machine (no automation here). I simply scp (linux) or WinSCP (Windows) over a copy. I often end up using this to help with development more than anything since it is nice to have a prepopulated database when testing new features.
If you want to browse the data in the database file to try to track down a problem or just explore how things work under the hood, then I recommend DB Browser, it’s a nice tool for people like me that are SQL noobs.
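If you would rather make a one-off copy from Python than set up rsnapshot, the standard library's sqlite3 module has a backup method (Python 3.7+) that copies a database consistently even if something else has it open. This is a different technique from the rsnapshot and scp approaches mentioned above, and the function name and file paths are placeholders.

```python
import sqlite3


def backup_database(source_path, backup_path):
    """Copy a sqlite database file using sqlite's online backup API."""
    source = sqlite3.connect(source_path)
    target = sqlite3.connect(backup_path)
    try:
        # Copies every page of the source database into the target.
        source.backup(target)
    finally:
        target.close()
        source.close()
```

Something like backup_database("rikka.sqlite", "rikka-backup.sqlite") would then produce a standalone copy, where both file names are made up for the example.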

WTree · 7 months ago

Thank you for your tips. I will keep those in mind.

Since the database used is sqlite, it is simply a file on the filesystem. I back it up using rsnapshot.

Currently I am testing on my laptop so maybe I will just use rsync to backup to external drives for now.

I recommend DB Browser

Yesterday I was searching how to open sqlite file and found this and installed it. It is great to visually see how the data are stored in sqlite database.
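For a quick look without a GUI, a few lines of Python can list the tables and how many rows each one holds. This sketch only relies on sqlite's built-in sqlite_master catalog, so it makes no assumptions about rikka's columns. The function names are hypothetical.

```python
import sqlite3


def list_tables(connection):
    """Return the names of all tables in the database, sorted by name."""
    rows = connection.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    ).fetchall()
    return [name for (name,) in rows]


def row_counts(connection):
    """Return a dict mapping each table name to its number of rows."""
    counts = {}
    for name in list_tables(connection):
        # Table names come from sqlite_master, so quoting them is enough here.
        (count,) = connection.execute(f'SELECT COUNT(*) FROM "{name}"').fetchone()
        counts[name] = count
    return counts
```

Opening the file with sqlite3.connect and passing the connection to row_counts gives a quick overview of tables such as Shows or UpcomingEpisodes.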

Kyouma · 7 months ago

First of all, I am grateful for your tool. We are still struggling as we just landed here. It is our first time hosting a bot for discussions, and we wish you all the best with improvements to your tool.

Episode	Link
2	Link
10	Link