NewsBlur/apps/rss_feeds/management/commands/trim_feeds.py

import gc

from django.core.management.base import BaseCommand

from apps.rss_feeds.models import Feed


class Command(BaseCommand):
    def add_arguments(self, parser):
        parser.add_argument("-f", "--feed", dest="feed", default=None)

    def handle(self, *args, **options):
        if not options["feed"]:
            feeds = Feed.objects.filter(fetched_once=True, active_subscribers=0, premium_subscribers=0)
        else:
            feeds = Feed.objects.filter(feed_id=options["feed"])

        for f in queryset_iterator(feeds):
            f.trim_feed(verbose=True)


def queryset_iterator(queryset, chunksize=100):
    """
    Iterate over a Django Queryset ordered by the primary key

    This method loads a maximum of chunksize (default: 100) rows into
    memory at a time, whereas Django would normally load all rows into
    memory. Using the iterator() method alone only keeps Django from
    preloading all the classes.

    Note that the implementation of the iterator does not support ordered query sets.
    """
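    # An empty queryset has no first or last row, so there is nothing to yield.
    if not queryset.exists():
        return
    last_pk = queryset.order_by("-pk")[0].pk
    queryset = queryset.order_by("pk")
    pk = queryset[0].pk
    # Use <= so the row whose pk equals last_pk is not skipped, and bound each
    # window above so rows from later windows are not yielded twice.
    while pk <= last_pk:
        for row in queryset.filter(pk__gte=pk, pk__lt=pk + chunksize, pk__lte=last_pk):
            yield row
        pk += chunksize
        gc.collect()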
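

```python
# A minimal, Django-free sketch of the primary-key windowing idea behind
# queryset_iterator, assuming every row should be yielded exactly once.
# Everything here is hypothetical and for illustration only: iter_pk_windows
# is not part of this module, and the plain list of integers stands in for a
# queryset ordered by pk.
def iter_pk_windows(pks, chunksize=100):
    """Yield each pk once, scanning half-open windows [start, start + chunksize)."""
    if not pks:
        return
    ordered = sorted(pks)
    start, last_pk = ordered[0], ordered[-1]
    while start <= last_pk:
        # A window spans at most `chunksize` integer keys, so sparse keys
        # cannot spill into the next window and be yielded twice.
        for pk in ordered:
            if start <= pk < start + chunksize:
                yield pk
        start += chunksize


# Sparse keys, including one (101) that sits exactly on a window boundary.
sample = [1, 50, 101, 150, 420]
assert list(iter_pk_windows(sample, chunksize=100)) == sample
```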