path: root/doc/bugs/forget_corrupts_non-ascii_chars.mdwn
Commit message | Author | Age
* remove old closed bugs and todo items to speed up wiki updates and reduce size | Joey Hess | 2016-04-19

  Remove closed bugs and todos that were last edited or commented before Q3 2015.

  Command line used:

    for f in $(grep -l '\[\[done\]\]' -- *.mdwn); do d="$(echo "$f" | sed 's/.mdwn$//')"; if [ -z "$(git log --since=09-09-2015 --pretty=oneline -- "$f")" -a -z "$(git log --since=09-09-2015 --pretty=oneline -- "$d")" ]; then git rm -- "$f"; git rm -rf "$d"; fi; done
    for f in $(grep -l '|done\]\]' -- *.mdwn); do d="$(echo "$f" | sed 's/.mdwn$//')"; if [ -z "$(git log --since=09-09-2015 --pretty=oneline -- "$f")" -a -z "$(git log --since=09-09-2015 --pretty=oneline -- "$d")" ]; then git rm -- "$f"; git rm -rf "$d"; fi; done
* Fix encoding of data written to git-annex branch. Avoid truncating unicode characters to 8 bits. | Joey Hess | 2014-05-27

  Allow any encoding to be used, as with filenames (but utf8 is the sane choice). Affects metadata and repository descriptions, and preferred content expressions.

  The question of what's the right encoding for the git-annex branch is a vexing one. utf-8 would be a nice choice, but this leaves the possibility of bad data getting into a git-annex branch somehow, and this resulting in git-annex crashing with encoding errors, which is a failure mode I want to avoid.

  (Also, preferred content expressions can refer to filenames, and filenames can have any encoding, so limiting to utf-8 would not be ideal.)

  The union merge code already took care to not assume any encoding for a file. Except it assumes that any \n is a literal newline, and not part of some encoding of a character that happens to contain a newline. (At least utf-8 avoids using newline for anything except literal newlines.)

  Adapted the git-annex branch code to use this same approach.

  Note that there is a potential interop problem with Windows, since FileSystemEncoding doesn't work there, and instead things are always decoded as utf-8. If someone uses a non-utf8 encoding for data on the git-annex branch, this can lead to an encoding error on Windows. However, this commit doesn't actually make that any worse, because the union merge code would similarly fail with an encoding error on Windows in that situation.

  This commit was sponsored by Kyle Meyer.
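The encoding points in the commit message above can be illustrated with a small Python sketch. This is not git-annex code (git-annex is written in Haskell). The helper names `truncate_to_8_bits`, `split_lines_bytewise` and `roundtrip` are hypothetical, and the assumption that the corruption amounted to keeping only the low 8 bits of each code point is an interpretation of "truncating unicode characters to 8 bits". Python's `surrogateescape` error handler stands in here, by loose analogy, for an encoding-agnostic round trip.

```python
def truncate_to_8_bits(text):
    """Keep only the low 8 bits of each code point (the assumed lossy behaviour)."""
    return "".join(chr(ord(ch) & 0xFF) for ch in text)


def split_lines_bytewise(data):
    """Split on the byte 0x0A without decoding, so no encoding is assumed."""
    return data.split(b"\n")


def roundtrip(data):
    """Decode and re-encode arbitrary bytes losslessly via surrogateescape."""
    text = data.decode("utf-8", errors="surrogateescape")
    return text.encode("utf-8", errors="surrogateescape")


# Non-ASCII characters above U+00FF are mangled by 8-bit truncation.
sample = "zażółć 日本語"
truncated = truncate_to_8_bits(sample)

# In UTF-8 the byte 0x0A only ever encodes a literal newline, so a
# bytewise split never cuts a multi-byte character in half.
utf8_lines = split_lines_bytewise("日本語\nzażółć".encode("utf-8"))

# In UTF-16 the same byte can be part of another character (U+010A here),
# which is the caveat the commit message mentions.
utf16_lines = split_lines_bytewise("\u010a".encode("utf-16-le"))

# Bytes that are not valid UTF-8 still survive a decode/encode round trip.
latin1_bytes = "café".encode("latin-1")
restored = roundtrip(latin1_bytes)
```

Splitting on the raw newline byte keeps the reader independent of whatever encoding the data uses. The cost is the one caveat noted in the commit message: an encoding that reuses the byte 0x0A inside other characters would be split in the wrong place.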
* sign and repaste | https://id.koumbit.net/anarcat | 2014-05-14
* (no commit message) | https://id.koumbit.net/anarcat | 2014-05-14