51 commits

Author SHA1 Message Date
Samuel Clay
5c44c34da5 Overriding story url. 2020-04-30 14:53:42 -04:00
Samuel Clay
057d19acf1 Removing Daily Skip from text importer. 2020-04-30 14:36:32 -04:00
Samuel Clay
e5a8ef5978 Fixing KeyError 2019-08-21 18:33:57 -07:00
Samuel Clay
9566a26795 Rewriting relative image urls in original text with absolute urls. 2019-08-21 18:23:02 -07:00
Samuel Clay
2e6ad3afda Adding new node app: original_text. To replace Mercury Reader. Thanks for all the text. 2019-04-13 15:29:14 -04:00
Samuel Clay
e74dde6e30 Fixing logging for mercury parser error. 2018-08-14 15:56:04 -04:00
Samuel Clay
f04e1a5279 Handling mercury text parsing error. 2018-08-09 09:47:10 -04:00
Samuel Clay
3b3ea98afd Handling no host error in text importer. 2018-07-16 10:56:33 -04:00
Samuel Clay
94114595a6 Sanity check on extracting image urls in Text view. 2018-01-18 08:06:32 -08:00
Samuel Clay
8421f667d7 Fixing broken image handling from Mercury Reader that was causing image urls with a srcset to be concat'd together. This one's for @yesthatjwz. 2018-01-17 16:51:06 -08:00
Samuel Clay
b7574a1ff7 No longer finding the largest image in a story if the text view already successfully found one. Also using Mercury's builtin image finder. 2017-11-02 22:09:37 -07:00
Samuel Clay
f242a49d24 Handling mercury errors. 2017-10-30 11:47:18 -07:00
Samuel Clay
2827b896b5 Handling issue when story has no original content. 2017-10-24 15:33:27 -07:00
Samuel Clay
ec7e032c28 Switching to Mercury text parser, which is an upgraded Readability. Using old readability as backup. 2017-10-24 15:28:36 -07:00
Samuel Clay
c476d89e1f Removing breaking text importer UTF-8 encoding. 2017-10-15 17:15:56 -07:00
Samuel Clay
ef51152bcd Updating readability class names to look for. 2017-09-29 10:50:13 -07:00
Samuel Clay
82cdae1e4d Extracting images from original text's noscript. 2017-03-23 16:28:47 -07:00
Samuel Clay
2c195cde2a Fetching the original text now extracts the image url for others. 2017-03-23 16:06:06 -07:00
Samuel Clay
aee018f39c Upgrading Readability and forcing images to remain. This should add a bunch of images back to the Text view. 2017-01-25 17:35:48 -08:00
Samuel Clay
c4830e3e95 Handling unicode encode errors in page/text handling. Also adding upgrade command for fabric when pip is non-trivial. 2016-12-05 22:09:05 -08:00
Samuel Clay
3ed96e338c Fixing page and text importer to correctly handle non-breaking spaces. 2016-12-05 17:40:39 -08:00
Samuel Clay
e43733ce30 Handling lxml parser errors for original text. 2016-06-28 16:11:46 -07:00
Samuel Clay
53e4998146 Merge pull request #835 from sv0/text_importer (Text importer) 2015-11-30 16:03:50 -08:00
Samuel Clay
00846fd5b3 Handling more requests errors in text importer. 2015-11-30 13:02:17 -08:00
Slavik Svyrydiuk
553344e6b5 skip encoding checking. requests.get already did it for us. requests.get always returns unicode in 'text' 2015-11-28 21:59:57 +01:00
Slavik Svyrydiuk
0c7b8478bc PEP8 fixes 2015-11-27 08:18:34 +01:00
Samuel Clay
43d1339ef0 Fixing broken story url. 2015-08-03 20:12:51 -07:00
Samuel Clay
22ae0e65e4 Removing gamespot.com feed from page and text fetchers. 2015-08-03 20:09:36 -07:00
Samuel Clay
bd334ef20f Boosting readability's ability to read Medium posts. 2014-07-21 14:22:07 -07:00
Samuel Clay
5e064fcc2e Handling PyAsn1Error 2014-05-27 13:08:21 -07:00
Samuel Clay
8086a4bc3f Fixing OpenSSL errors in text importer. 2014-05-22 15:15:34 -07:00
Samuel Clay
7fb8320fb6 Fixing unicode issues in text importer. Closes #224. 2014-04-07 13:04:39 -07:00
Samuel Clay
52fc2f713b from requests.packages.urllib3.exceptions import LocationParseError 2014-03-29 20:24:08 -07:00
Samuel Clay
5f8d36212d Handling another requests exception. 2014-03-29 17:17:30 -07:00
Samuel Clay
6e09610c75 Fixing readability bug in text importer. 2014-03-10 11:55:10 -07:00
Samuel Clay
dc9cb08070 Fixing feeds in saved stories. Adding ability to save only a URL with IFTTT and have it fetch content and title. 2014-02-18 12:39:57 -08:00
Samuel Clay
2b2c731bd5 Fixing a couple exceptions and turning off google news rss feeds temporarily. 2013-09-09 13:46:14 -07:00
Samuel Clay
88f2a69a93 Fixing a dozen text and feed fetching bugs. 2013-08-06 13:18:55 -07:00
Samuel Clay
1936515aa0 Fixing Dilbert-specific encoding error. 2013-07-18 15:17:15 -07:00
Samuel Clay
679195aadd Adding necessary exception handling to text view. 2013-07-15 11:06:56 -07:00
Samuel Clay
1fc3e05b89 Fixing small text importer bug with missing encodings. 2013-07-13 14:22:23 -07:00
Samuel Clay
f4abcc0ade Turning off SSL verification on text importing. 2013-07-10 17:03:12 -07:00
Samuel Clay
c18603c19a Fixing text view. 2013-07-10 14:14:43 -07:00
Samuel Clay
44744b4395 Fixing text importer. 2013-07-07 21:51:36 -07:00
Samuel Clay
1c3955a5ea Attempting a fix for encoding issues in Text view. 2013-07-05 16:13:20 -07:00
Samuel Clay
0eb2ccf4eb Turning down color warning for original text. 2013-06-30 17:12:41 -07:00
Samuel Clay
ef19cda260 Gracefully failing Text view fetch. 2013-06-09 03:01:36 -04:00
Samuel Clay
7e52fc37ef Fixing assortment of small bugs. 2013-01-28 15:43:00 -08:00
Samuel Clay
9c1d0c42dd Adding premium requirement for Text view. 2013-01-09 12:53:30 -08:00
Samuel Clay
bb716839b6 Correcting a few glitches in text view. Shaping up nicely. Just need premium only. 2013-01-08 19:29:02 -08:00