forked from wallabag/wallabag
update config from @fivefilters
@@ -1,3 +1,4 @@
+# 2015.07.08 [Marvin Dickhaus] fixed single_page_link
# 2013.10.30 [rezor92] fixed single_page_link
# 2012-12-23 [carlo@...] fixed half-assed headlines in articles, removed inline author profiles, adjusted picture captions
# 2012-03-17 [dkless@...] Cut metadata parts in the beginning and the ends of the content block; copyright entries for pictures removed; Author fixed, not sure if old entries still valid (I left them); Weird problems with some pages addressed (see last section for removing hidden section)
@@ -5,8 +6,7 @@
# 2011-08-23 [carlo@...] changed single page link to use print version: page works better, less ambiguity. Related cleanups and simplifications.
# 2011-08-20 [carlo@...] added author, fixed date
-single_page_link: //a[@title='Auf einer Seite']
+single_page_link: //a[contains(@href, 'komplettansicht')]
tidy: no
title: //title
@@ -24,6 +24,8 @@ strip: //p[@class="copyright"]
strip: //div[@class="copyright"]
#Removes pagination links at the end
strip: //div[@class="pagination"]
+#Removes link to main page at the bottom of some articles (Zur Startseite)
+strip: //a[@href='http://www.zeit.de']
# Fix picture captions
wrap_in(small): //p[@class="caption"]/text()
@@ -43,3 +45,4 @@ strip_id_or_class:"pagination"
footnotes: no
test_url: http://www.zeit.de/kultur/film/2012-12/Kurzfilmtag
+test_url: http://www.zeit.de/kultur/2015-07/kapitalismuskritik-selbstberuhigung-armin-nassehi