Commit Graph

225 Commits

Author SHA1 Message Date
b22eb27623 ContentProxy: replace ignoreUrl with new RuleBasedIgnoreOriginProcessor
Signed-off-by: Kevin Decherf <kevin@kdecherf.com>
2020-04-25 15:59:23 +02:00
f39c5a2a70 Add new Helper to process Ignore Origin rules and RulerZ operator
This commits adds a new helper like RuleBasedTagger for processing
ignore origin rules. It also adds a new custom RulerZ operator for the
'~' pattern matching rule.

Renames 'pattern' with '_all' in IgnoreOriginRule entity.

Signed-off-by: Kevin Decherf <kevin@kdecherf.com>
2020-04-25 15:59:23 +02:00
0bddd34847 Added publication date on epub export 2020-04-06 16:14:36 +02:00
b12e23ad8a Cleanup cookie jar
As of latest Guzzle release, it's fixed so we can removed that code.
2020-03-29 11:39:49 +02:00
8d4ed0df06 Update deps
Also CS (because cs-fixer got an update)

Package operations: 0 installs, 26 updates, 0 removals
  - Updating twig/twig (v2.12.1 => v2.12.2)
  - Updating symfony/symfony (v3.4.33 => v3.4.34)
  - Updating doctrine/event-manager (v1.0.0 => 1.1.0)
  - Updating doctrine/collections (v1.6.2 => 1.6.3)
  - Updating doctrine/cache (v1.8.1 => 1.9.0)
  - Updating doctrine/persistence (1.1.1 => 1.2.0)
  - Updating doctrine/inflector (v1.3.0 => 1.3.1)
  - Updating symfony/mime (v4.3.5 => v4.3.7)
  - Updating swiftmailer/swiftmailer (v6.2.1 => v6.2.3)
  - Updating symfony/swiftmailer-bundle (v3.3.0 => v3.3.1)
  - Updating doctrine/dbal (v2.9.2 => v2.9.3)
  - Updating doctrine/instantiator (1.2.0 => 1.3.0)
  - Updating j0k3r/graby-site-config (1.0.93 => 1.0.94)
  - Updating phpoption/phpoption (1.5.0 => 1.5.2)
  - Updating symfony/http-client-contracts (v1.1.7 => v1.1.8)
  - Updating symfony/http-client (v4.3.5 => v4.3.7)
  - Updating sensiolabs/security-checker (v6.0.2 => v6.0.3)
  - Updating paragonie/constant_time_encoding (v2.2.3 => v2.3.0)
  - Updating scheb/two-factor-bundle (v4.7.1 => v4.8.0)
  - Updating symfony/phpunit-bridge (v4.3.6 => v4.3.7)
  - Updating composer/xdebug-handler (1.3.3 => 1.4.0)
  - Updating friendsofphp/php-cs-fixer (v2.15.3 => v2.16.0)
  - Updating doctrine/data-fixtures (v1.3.2 => 1.3.3)
  - Updating nette/schema (v1.0.0 => v1.0.1)
  - Updating nikic/php-parser (v4.2.4 => v4.3.0)
  - Updating sentry/sentry (2.2.2 => 2.2.4)
2019-11-12 14:18:58 +01:00
51d7f62b31 Add logger to FileCookieJar 2019-07-24 16:07:38 +02:00
9a80dcf11e Use a custom cookiejar to avoid error when the cookie is badly saved
It happens sometimes on wallabag.it, the json inside the cookie is badly saved and the json isn't valid. It generates an exception and avoid people to use the api and import contents.
To fix that, we use a dedicated `FileCookieJar`, which extends the default one from Guzzle to fix these issues.

Also updated deps
2019-07-24 10:42:20 +02:00
16e1c07553 Merge pull request #3271 from wallabag/store-resolved-url
Add `given_url` in Entry table to check if a redirected url has already added
2019-06-05 11:38:00 +02:00
7abda3ba52 Drop SimplePie
It was only used to make an absolute url when downloading images.
The deps is still there (in the `composer.lock`) because Graby use it (not for absolute but for encoding).
2019-05-29 17:05:12 +02:00
f3bfb875e9 Use hash given url to avoid duplicate
Using hashed url we can ensure an index on them to ensure it's fast.
2019-05-29 15:56:20 +02:00
b7fa51ae7d Added given_url in entry table
- Added index on entry table for given_url field
- Fix tests:

    The previous `bit.ly` url redirected to doc.wallabag but that url doesn't exist in the fixtures.
    I used our own internal "redirector" to create a redirect to an url which exist in the fixtures.

Also, updating current migration to use the new `WallabagMigration`.
2019-05-29 13:50:59 +02:00
2cbee36a01 Merge pull request #3944 from shtrom/always-hash-exists-url
Always hash exists url
2019-05-28 14:18:33 +02:00
6e68417f03 Fix tests after rebase 2019-05-28 12:02:17 +02:00
b6c1e1bacc Fix some tests 2019-05-28 11:44:20 +02:00
448d99f84e CS 2019-05-28 11:42:27 +02:00
1048c9c4a8 Configure timeout 2019-05-28 11:42:27 +02:00
5f08426201 Fix because of some breaking changes of Graby 2.0 2019-05-28 11:42:27 +02:00
bf9ace0643 Use httplug 2019-05-28 11:40:41 +02:00
0132ccd2a2 Change the way to define algorithm for hashing url 2019-05-24 15:17:46 +02:00
4a5516376b Add Wallabag\CoreBundle\Helper\UrlHasher
Signed-off-by: Olivier Mehani <shtrom@ssji.net>
2019-05-24 15:17:46 +02:00
423efadefc Set first picture as preview picture 2019-05-21 20:38:22 +02:00
9f0957b831 Merge remote-tracking branch 'origin/master' into 2.4 2019-05-15 14:38:07 +02:00
2dbb5b2307 Enable no-referrer on img tags, enable strict-origin-when-cross-origin by default
Fixes #3889

Signed-off-by: Kevin Decherf <kevin@kdecherf.com>
2019-05-10 23:07:26 +02:00
844fd9fafc Fallback to default solution if Imagick fails 2019-05-10 16:52:01 +02:00
9306c2a368 Use Imagick to keep GIF animation
If Imagick is available, GIF will be saved using it to keep animation.
Otherwise the previous method will be used and the animation won't be kept.
2019-05-10 15:33:36 +02:00
531c8d0a5c Changed RSS to Atom feed and improve paging 2019-04-25 13:46:31 +02:00
3620dae1e6 Merge remote-tracking branch 'origin/master' into 2.4 2019-04-01 13:16:15 +02:00
bfd69c74e5 Merge pull request #3909 from wallabag/fix/html-not-defined
Fix PHP warning
2019-03-18 09:26:33 +01:00
8ca858ee73 Fix PHP warning
Looks like sometimes (usually from import) the `html` key isn’t available.
2019-03-18 06:23:41 +01:00
41d476d7e7 epub: fix exception when articles have the same title
This commit fixes an exception occuring when exporting as epub several
articles with the same title. The chapter filename is now derived from
title and url.

Fixes #3642

Signed-off-by: Kevin Decherf <kevin@kdecherf.com>
2019-03-17 23:36:10 +01:00
9a7a0e1e6b epub export: fix missing cover image, only for exports of one article
Fixes #3602

Signed-off-by: Kevin Decherf <kevin@kdecherf.com>
2019-02-18 00:16:05 +01:00
0182cdaec4 CS 2019-02-11 11:57:52 +01:00
1e0d8ad7b7 Enable PHPStan
- Fix error for level 0 & 1 (level 7 has 699 errors...)
- Add `updated_at` to site_credential (so the `timestamps()` method applies correctly)
2019-01-18 15:25:50 +01:00
3afc87426d CS 2019-01-15 09:49:22 +01:00
5e1f27767b EntriesExport: avoid else on $authors
Signed-off-by: Kevin Decherf <kevin@kdecherf.com>
2019-01-09 16:26:19 +01:00
dac93644e8 EntriesExport: sanitize filename and fix tests
Filename will now only use a-zA-Z0-9-' and space.

Fixes remaining filename issue on #3811

Signed-off-by: Kevin Decherf <kevin@kdecherf.com>
2019-01-08 15:13:35 +01:00
ad5ef8bca0 EntriesExport/pdf: move notice to the end, add metadata cover
Signed-off-by: Kevin Decherf <kevin@kdecherf.com>
2019-01-07 23:36:41 +01:00
4944703edc EntriesExport/epub: add metadata to each entry's cover
Add metadata to the cover of each entry:

- Publishers
- Estimated reading time
- Date of creation ("Added on")
- Address (URL)

Related to #2821

Signed-off-by: Kevin Decherf <kevin@kdecherf.com>
2019-01-07 21:44:14 +01:00
f810834623 EntriesExport: change authors and title when not single entry export
Change '{method} authors' (which gives 'Tag_entries authors' when
exporting a tag) to 'Various authors'.

When exporting a tag (tag_entries), change the title from 'Tag_entries
articles' to 'Tag {tag} articles'.

Signed-off-by: Kevin Decherf <kevin@kdecherf.com>
2019-01-07 21:44:14 +01:00
30cf72bf55 EntriesExport/epub: revert c779373f, move exportinfo to the end of the book
Signed-off-by: Kevin Decherf <kevin@kdecherf.com>
2019-01-07 21:43:16 +01:00
edd1825b58 EntriesExport/epub: use sha1 sums for filenames, fix and rename title chapters
This commit renames entry chapters file using a sha1 sum of their title
for simplicity. Also we fix the 'Title' chapter duplicate issue by using
the hash of the related entry and the suffix '_title'.

Signed-off-by: Kevin Decherf <kevin@kdecherf.com>
2019-01-07 21:41:12 +01:00
063d5e7bda EntriesExport/epub: remove TOC page
This change only remove the rendered page of the TOC at the end of the
book, the TOC remains available to readers.

Fixes #3603

Signed-off-by: Kevin Decherf <kevin@kdecherf.com>
2019-01-07 21:11:05 +01:00
bf22266a62 EntriesExport/epub: replace epub identifier with unique urn
We replace the title used as the unique identifier of the epub file with
a urn following the format:

  urn:wallabag:{sha1("wallabagUrl:listOfEntryIdsSeparatedByComma")}

This format is repeatable: it always gives the same uid for the same
list of entries.

Fixes #3811

Signed-off-by: Kevin Decherf <kevin@kdecherf.com>
2019-01-06 23:29:32 +01:00
1b220426e2 phpcs
Signed-off-by: Kevin Decherf <kevin@kdecherf.com>
2018-10-24 22:33:32 +02:00
6059967951 updateOriginUrl: remove 'query string' case from ignore list
Two urls with a different query string may refer to two different pages
so keep them both.

Signed-off-by: Kevin Decherf <kevin@kdecherf.com>
2018-10-24 22:27:27 +02:00
44e63667d9 updateOriginUrl: add comment blocks for the parse_url diff check
Signed-off-by: Kevin Decherf <kevin@kdecherf.com>
2018-10-24 22:13:03 +02:00
5ba5e22a09 updateOriginUrl: rewrite some if, resolving feedbacks from PR
Signed-off-by: Kevin Decherf <kevin@kdecherf.com>
2018-10-24 21:54:09 +02:00
b49c87acf1 ignoreOriginUrl: add initial support of ignore lists
Add the ability to specify hosts and patterns lists to ignore the given
entry url and replace it with the fetched content url without touching
to origin_url.

This initial support should be reworked in the following months to move
the hardcoded ignore lists in the database.

Signed-off-by: Kevin Decherf <kevin@kdecherf.com>
2018-10-22 23:42:09 +02:00
fc040c749d updateOriginUrl: add behavior when diff is fragment and query
Signed-off-by: Kevin Decherf <kevin@kdecherf.com>
2018-10-22 23:08:58 +02:00
e07fadea76 Refactor updateOriginUrl to include new behaviors behaviors
- Leave origin_url unchanged if difference is an ending slash
- Leave origin_url unchanged if difference is scheme
- Ignore (noop) if difference is query string or fragment

Signed-off-by: Kevin Decherf <kevin@kdecherf.com>
2018-10-22 23:01:16 +02:00