Data storage formats
One of the most vexed questions for me with digital text files is, what format is best to store my personal writings in? For many years I have been keeping these in web page format (.html) as it seems to be reasonably future-proof, is relatively easy to learn (HTML is a markup language, not a programming language), I can edit files in any text editor, and view them in any web browser offline. I can also link between files (a primary purpose of the HTML format). The disadvantage of HTML files is that creating them requires a bit more manual work than, say, using a dedicated word processor such as Microsoft Word (or its open-source equivalent, LibreOffice).
Plain text files (.txt) are the most future-proof, as they can be read in any text editor on any platform. Their disadvantage is that they cannot link to one another – this is where HTML files have the advantage. (HTML files are just text files with a special file name extension so that a browser will open them for viewing.) HTML markup is, however, more verbose.
I have spent a lot of mental energy and time in trying out various dedicated programs for my creative projects, such as various wikis and assorted dedicated editors. But I always seem to come back to making my own files as I am distrustful of the longevity of such programs made by others. There is no guarantee that these will be maintained into the future.
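For anyone comfortable with a little scripting, some of the manual work of hand-made HTML files can be automated without giving up the plain-file approach. The following is only a minimal sketch in Python, not a recommended tool: it assumes all pages sit in one folder, and the names `build_index`, `page_title` and `index.html` are made up for illustration. It reads each page's title and writes an index page linking to every file.

```python
from html import escape
from html.parser import HTMLParser
from pathlib import Path
from urllib.parse import quote


class TitleParser(HTMLParser):
    """Collects the text inside the first <title> element of a page."""

    def __init__(self):
        super().__init__()
        self.in_title = False
        self.title = ""

    def handle_starttag(self, tag, attrs):
        # Only the first <title> counts; later ones are ignored.
        if tag == "title" and not self.title:
            self.in_title = True

    def handle_endtag(self, tag):
        if tag == "title":
            self.in_title = False

    def handle_data(self, data):
        if self.in_title:
            self.title += data


def page_title(path):
    """Return the page's <title> text, or its file name if it has none."""
    parser = TitleParser()
    parser.feed(path.read_text(encoding="utf-8", errors="replace"))
    parser.close()
    return parser.title.strip() or path.name


def build_index(folder, index_name="index.html"):
    """Write an index page linking to every other .html file in folder."""
    folder = Path(folder)
    pages = sorted(p for p in folder.glob("*.html") if p.name != index_name)
    items = "\n".join(
        f'    <li><a href="{quote(p.name)}">{escape(page_title(p))}</a></li>'
        for p in pages
    )
    html = (
        "<!DOCTYPE html>\n<html>\n<head>\n"
        '  <meta charset="utf-8">\n  <title>Index</title>\n'
        "</head>\n<body>\n  <ul>\n" + items + "\n  </ul>\n</body>\n</html>\n"
    )
    index_path = folder / index_name
    index_path.write_text(html, encoding="utf-8")
    return index_path
```

Calling `build_index("story")` on a hypothetical `story` folder would write `story/index.html`. The result is still an ordinary HTML file that can be opened in any browser or text editor.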
The quest for an editor
Below are some quoted entries from my Journal on the topic. I tried out Dokuwiki, initially for a creative project.
I have been procrastinating a bit by my periodical dithering over what file format to use for my various projects (my story and journal, and so on). I came across this article at Wikibooks, “Choosing the Right File Format,” on “future-proofing” your work so that it can be read decades from now on whatever technology exists then. The consensus for documents includes simple plain text (.txt) files, or basic Rich Text Format (.rtf) if visual formatting is wanted, and PDFs. For web pages, HTML is recommended, and for images, .jpg and .png among others. The main important feature is that the formats be open-source, not proprietary, so document editors such as Microsoft Word are not recommended for the long term. As a last resort, if I am unable to use or afford programs such as Microsoft Office in the future (the one I have installed was bought by Dad), there are open-source alternatives (sites such as PortableApps.com have an extensive collection).
Related to this, I am not sure whether I want to keep using the wikis I have installed or not – there are advantages and disadvantages to them vs. static HTML files:
- I can edit them online – I can’t do this with HTML in my current setup (which is why I miss the old Yahoo! sites). To my knowledge, none of the cloud backup sites such as OneDrive or Dropbox allow editing of HTML files on their servers.
- I can use tags and search, which I can’t do with plain static HTML files.
- There is, however, no free website hosting for the wiki I use (Dokuwiki) – and I might not have the hosting I am on now forever.
- The wiki has a lot of backend files to enable it to function (literally thousands of files, though most are small) and this makes it a bit cumbersome (though not as much as MediaWiki). HTML is a lot simpler – just the HTML files, stylesheets and images are all that are uploaded.
I was looking at TiddlyWiki as an alternative – it operates through a browser and stores everything in one HTML file, though it needs some exterior software to function. I spent most of yesterday fiddling around with it, and found it a bit exasperating to figure out at times. It uses some different Wiki markup to Dokuwiki – this is a major irritation between different wiki formats, as there is sometimes no easy way to convert one type of markup to another. As I have a lot of wiki files now, it would take me weeks to copy and convert them to TiddlyWiki format. It is also difficult to incorporate images – these are generally expected to be hosted elsewhere, though images can be embedded into the document, but this greatly increases its size. It can be password-protected, but the only options are to set a password and only let those who know it view and edit it, or set no password and then anyone can edit it – there doesn’t seem to be an option for letting others view but only the owner edit.
So I have yet to find the ideal solution (namely, one program to rule them all), and waste a lot of time and mental energy dithering about it! Ultimately, I basically prefer HTML as I am familiar with it and, as a minimum, only need a text editor and a browser to write and display pages. If no browser is available, I can still read the .html files as text.
– 10/9/2014 entry
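The lack of search mentioned in the entry above can be partly worked around with a small script run over the folder of pages. This is only a sketch under assumptions: the pages are UTF-8 `.html` files, and the names `page_text` and `search_pages` are invented here. It covers plain text search only, not tags.

```python
from html.parser import HTMLParser
from pathlib import Path


class TextExtractor(HTMLParser):
    """Gathers the visible text of a page, skipping <script> and <style>."""

    def __init__(self):
        super().__init__()
        self.chunks = []
        self.skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self.skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self.skip_depth:
            self.skip_depth -= 1

    def handle_data(self, data):
        if not self.skip_depth:
            self.chunks.append(data)


def page_text(path):
    """Return the page's text with tags removed and whitespace collapsed."""
    parser = TextExtractor()
    parser.feed(Path(path).read_text(encoding="utf-8", errors="replace"))
    parser.close()
    # Chunks are joined with spaces, so a word split by an inline tag
    # (for example Wor<b>ld</b>) would not match as one word.
    return " ".join(" ".join(parser.chunks).split())


def search_pages(folder, term):
    """Return the .html files under folder whose text contains term.

    Matching is case-insensitive; paths are relative to folder.
    """
    folder = Path(folder)
    needle = term.casefold()
    return sorted(
        p.relative_to(folder).as_posix()
        for p in folder.rglob("*.html")
        if needle in page_text(p).casefold()
    )
```

For example, `search_pages("journal", "wiki")` would list every page in a hypothetical `journal` folder that mentions wikis, ignoring anything that appears only inside markup, scripts or stylesheets.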
I tried copying my story (in HTML format) to Microsoft Word, but I found trying to format it was a fiddly and exasperating process. Writing in Word is a bit easier, but its underlying code is very obfuscated, while HTML coding is much more straightforward and accessible (though typing formatting tags is annoying); I can also get more fine-grained control over a page’s appearance.
I had a look at OneNote – the proprietary “free” note-taking program that resides in Microsoft’s OneDrive cloud – but I cannot export any notes into other formats such as HTML; it is very limited in what I can do with it and feels rather insubstantial and unstable. If I did not have an account I would not be able to use it. So that is out.
Someone at r/worldbuilding on Reddit mentioned yet another note-taking/wiki-type program called Twine; this seems very like TiddlyWiki in its coding and concept, though it has to be installed as a Python-based program, rather than just open as an editable HTML file in one’s browser, which is a bit off-putting (I prefer portable versions of programs). I seem to be always on the lookout for that one program that would meet all my requirements, but none so far do – and I certainly do not have any programming knowledge to write my own.
(15/9/2015 entry)
I tried out OneNote, a free Windows program that gets recommended on r/worldbuilding:
Regarding OneNote, I discovered that the free desktop version from the Microsoft site is more full-featured than the rudimentary tile version on the 8.1 Start Menu, so I downloaded and installed this to try. Initially it seemed nice to use, but it is still lacking in many features, such as automatic smart quotes and exporting to HTML pages (both of which Word does). A Microsoft account is also necessary to use it, as it syncs between this and the user’s computer rather than saving locally. So that is an annoyance. I can export a Notebook to PDF and some weird proprietary Microsoft formats such as .xps (a pseudo-PDF imitator) and .one Notebook files, and a single file to the odd .mht format (a bundled .html file), but nothing else. The code produced is a mess and elements such as lists are incorrectly and unsemantically formatted as paragraphs with a bullet in front rather than the proper <li> tags (Word does this too, with lists), and italic and bold tags are just <span class="italic">, not the correct <i> (and there are no <em> or <strong> tags, which have separate meanings in HTML5). I can’t do definition lists at all (I use these a lot).
So back to my plain HTML pages. The programs can be fun toys to play with but add more complexity to the basic task of producing information. I found this blog entry, On Trying New Writing Things Forever (note: some swearing), on the search for the perfect writing tool, a search that comes up short:
So what have I learned? First, don’t chase fads, at least not for a major project. A wiki might work for Tayler and Sanderson, and it was worth trying, but not on an already-troubled novel. Just, no. Second, don’t chase fads, period. There’s always some new tool or software that will make your writing go so much easier ohmyGOSH! It won’t. Which is to say it might, but you can’t make that determination in less than a year and you have other things to work on. Try stuff out, but don’t waste a lot of your time on anything less than brilliant.
(21/9/2015 entry)
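The span-based markup described in the entry above can, in simple cases, be tidied after export. The sketch below is illustrative only. It assumes the export uses exactly `class="italic"` (as quoted above) and a matching `class="bold"` (a guess, not confirmed by the entry), and that such spans are never nested. A regular expression is not a real HTML parser, so anything more tangled would need a proper one.

```python
import re

# Assumed mapping from exported class names to plain HTML tags.
# "italic" comes from the entry above; "bold" is a guess by analogy.
CLASS_TO_TAG = {"italic": "i", "bold": "b"}

# Matches a span with a single class attribute and captures its contents.
# Non-greedy, so it stops at the first closing </span> (no nesting support).
SPAN_RE = re.compile(
    r'<span\s+class="(?P<cls>[^"]+)"\s*>(?P<body>.*?)</span>',
    re.IGNORECASE | re.DOTALL,
)


def replace_styled_spans(html):
    """Swap known styled spans for <i>/<b>; leave all other spans alone."""

    def swap(match):
        tag = CLASS_TO_TAG.get(match.group("cls").lower())
        if tag is None:
            return match.group(0)  # unknown class: keep the span unchanged
        return f"<{tag}>{match.group('body')}</{tag}>"

    return SPAN_RE.sub(swap, html)
```

Swapping `"i"` and `"b"` for `"em"` and `"strong"` in `CLASS_TO_TAG` would be the choice where the text is meant as emphasis rather than plain styling.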
OneNote can also have some glitchy syncing issues with OneDrive, and some users have had their entire notebook disappear. (Reddit posts: Just deleted 3 weeks’ worth of work; Just lost all of my notes.) And while the program is currently free, there is no guarantee it will be into the future.
I also tried TiddlyWiki, a single-page JavaScript-based wiki editor that writes changes to its own file from within the browser it is open in. While I liked aspects of it, it is also a bit unstable:
My computer froze up for some reason last night when copying files to a backup drive, and I had to force-power it off. A couple of my story files got corrupted, but I copied replacements from another hard drive. I keep each chapter of my story in a separate file; that way if corruption happens only a chapter or two might be affected, whereas if it were in one file I would lose the whole story. That is a reason I am not using TiddlyWiki anymore for keeping background information about my story’s world as all information is contained in the one file. If the browser in which TiddlyWiki is being used crashes, the whole program is wiped out – it uses JavaScript to rewrite the file in the browser each time it is altered – as this user found out in the Google Groups forum:
konono#9:
Lost all my data in a tiddlywiki – something that doesn’t happen all too often nowadays.
I have a large number of tabs open in Firefox, and switch between a number of tabgroups, each with numerous tabs. I was adding text and images (via external links) to my wiki. Added a map link […] The link worked fine, except that the map was opened in the same tab instead of in a new tab, which I would have preferred.
At some point Firefox crashed. After restoring, my tiddlywiki was still there, but with 0 bytes! This shouldn’t ever happen. […] Thanks for all the sympathy but that’s not what I was looking for. I think fears of data loss can keep people from using tw and that would be too bad for this great tool.
Thomas Birkenstock:
Hi Konono,
just right a few minutes ago I had exactly the same issue … Is there any chance getting the wiki back? All my Data is gone … more than a year of work gone …
(11/11/2015 entry)
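Keeping one file per chapter also makes it practical to check a backup file by file. As a rough sketch (the folder layout and the names `file_digest` and `compare_with_backup` are assumptions for illustration), the following compares SHA-256 checksums between a working folder and its backup copy and reports which files are missing or differ.

```python
import hashlib
from pathlib import Path


def file_digest(path, chunk_size=65536):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def compare_with_backup(source_dir, backup_dir, pattern="*.html"):
    """Return (missing, different) lists of paths relative to source_dir.

    missing:   files that have no copy in the backup folder
    different: files whose contents do not match their backup copy
    """
    source_dir, backup_dir = Path(source_dir), Path(backup_dir)
    missing, different = [], []
    for src in sorted(source_dir.rglob(pattern)):
        rel = src.relative_to(source_dir)
        copy = backup_dir / rel
        if not copy.is_file():
            missing.append(rel.as_posix())
        elif file_digest(src) != file_digest(copy):
            different.append(rel.as_posix())
    return missing, different
```

A mismatch only shows that the two copies differ. It cannot tell a corrupted file from one that was simply edited after the last backup, so the differing files still need to be opened and checked by eye.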
There is also a very convoluted process to generate each TiddlyWiki entry as a single HTML page; it involves installing Node.js and running a command-line procedure – which is decidedly not beginner-friendly!
I managed to achieve a command-line procedure in TiddlyWiki and Node.js, after two weeks or so of confused frustration – namely, generate a static site (multiple .html pages) from a single-page TiddlyWiki document of mine (a wiki reference for my story). I made a plaintive plea in the forum and apparently I was missing a command. The instructions at the TW site are not detailed enough for beginners!
- For full (not portable) installation of node.js (install node.js first): tiddlywiki starwarrior --init server
- Go to http://127.0.0.1:8080/ in browser (or whatever address the server gives)
- Import twstarwarrior.html via import function/button in control panel (need to do this whenever I add new items)
- tiddlywiki ./starwarrior --server to start in browser
- Have to Ctrl + C to stop server so I can type in following commands
- To render static site, change directory in the command prompt to \starwarrior\ TW folder (chdir /D C:\Users\Ron\starwarrior), then:
  tiddlywiki --rendertiddler $:/core/templates/static.template.html static.html text/plain
  tiddlywiki --rendertiddler $:/core/templates/static.template.css static/static.css text/plain
  tiddlywiki --rendertiddlers [!is[system]] $:/core/templates/static.tiddler.html static text/plain
(17/12/2014 entry)
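The three render commands above have to be retyped after every import, so they are a natural candidate for a small wrapper script. The following is only a sketch and has not been run against a real TiddlyWiki install. It reuses the arguments exactly as given in the entry, which may differ in other TiddlyWiki versions. It assumes `tiddlywiki` is on the PATH, and the function names are made up for illustration.

```python
import shutil
import subprocess

# Arguments copied from the three commands in the entry above.
RENDER_STEPS = [
    ["--rendertiddler", "$:/core/templates/static.template.html",
     "static.html", "text/plain"],
    ["--rendertiddler", "$:/core/templates/static.template.css",
     "static/static.css", "text/plain"],
    ["--rendertiddlers", "[!is[system]]",
     "$:/core/templates/static.tiddler.html", "static", "text/plain"],
]


def render_commands(executable="tiddlywiki"):
    """Return the three render commands as argument lists."""
    return [[executable, *step] for step in RENDER_STEPS]


def render_static_site(wiki_dir):
    """Run the render commands inside wiki_dir, stopping at the first failure.

    Running from wiki_dir stands in for the change-directory step
    in the entry above.
    """
    executable = shutil.which("tiddlywiki")
    if executable is None:
        raise FileNotFoundError("tiddlywiki was not found on PATH")
    for command in render_commands(executable):
        # Arguments are passed as a list, so the filter [!is[system]]
        # reaches tiddlywiki without any shell quoting.
        subprocess.run(command, cwd=wiki_dir, check=True)
```

With this in place, `render_static_site(r"C:\Users\Ron\starwarrior")` would stand in for the last step of the list. The server still has to be stopped first, as noted above.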
More thoughts via Reddit:
tigerjerusalem:
As an avid note taker and control freak I’ve used everything: Evernote, onenote, Google Keep, simple note, Devon think and what not. Gave up on everything because I need to know where my files are and how to backup them.
Switched to plain text for notes, folders for organizing and Word in webview for articles from the Web. For tags, I use Tagspaces that works with my files instead of converting them to some obscure database.
spinwizard69:
This is the way to go if you value your data.
Plain text files do have their limitations but with things like Markdown and other text conversion utilities it is fairly easy to move a note into a more robust file. However one thing that I found that works for me is writing plain old HTML files.
Here is the thing with HTML and notes, the basics are fairly easy to manage and remember. Further keeping to the basics prevents a lot of browser compatibility problems. I like HTML for the ease of creating lists, be they bulleted or something else. Also a lot of my “notes” easily fit into tables and again basic tables are easy. A little CSS can dress up the file if needed.
Using the basics, and doing so from memory, has another positive effect your notes all end up looking the same. A theme if you will.
Of course this sucks with PDF’s. But the approach would be the same, use a file browser and the operating file system to take care of organization.
And from StackExchange:
I suspect people will object to me saying this, but still, wanted to give some food for thought:
Why not just keep plain text files, or documents made in whatever word processor you prefer?
I’m 32, and I’ve been writing on a computer since I was 18, so I have about 14 years of character and worldbuilding documents built up, for several different universes.
The trouble with deciding to put your character or world information in some flavor of software-of-the-month is that it tends to store data in a custom database or flat file schema. As you write more and more, and years pass, you are putting yourself in a position where, if the data format the software you use isn’t easily transferable to a newer program, you can abruptly find yourself in a position where your old data is difficult to access because your operating system changed, doesn’t support your old program, and nobody bothered to program an easy data migration path for YOUR software to whatever the new thing ends up being, because it was so niche.
You’ll be more insulated against this sort of issue if you decide on a naming scheme and just keep text files, or word processor files, on your hard drive or cloud drive in some sort of order. Since word processors in general are so widely used, you KNOW there’ll always be a way to convert text files or word processor documents to another format, and it probably won’t be all that painful. But if you start using really custom software, you might be in a frustrating position later with your worldbuilding and character data.
Or … you might not – maybe you’re technical and you don’t mind doing data migrations. I’m technical and not afraid of software and honestly I still think they’re a PITA and time-consuming. I’d rather spend my time writing than sorting out my data and putting it into a new format.
I just wanted to give some food for thought before you or anyone tries to pick software thinking it’ll work wonders for the writing process.
Writing really comes down to words on a page … fancy databases are often good timewasters but can distract you from actually writing, and in the long term can make your old data difficult to access if you want it later.
So I have pretty much given up with editor experimentation and will stick with my .html pages. With the wiki editors, another disadvantage is that there is yet another markup syntax to learn (and it varies between the different brands of wikis). They also require software such as PHP to generate the files, which involves the installation of more dependencies to get them to function.
Links
- Choosing the right file format
- “Making your own static web site isn’t nostalgia. It’s the future of the web,” Neocities blog, 27/2/2015. An assertion that I agree with! Basic HTML “is durable, dependable, and lasts forever.” I have a lot of my personal files written in HTML; a bonus is that I can link between files.
- This Page is Designed to Last: A Manifesto for Preserving Content on the Web, Jeff Huang
- John Ankarström: Writing HTML in HTML; Static versus dynamic web sites. “… with nothing but pure HTML, there is no threshold. When I used a static site generator, I always had to do a dozen small things – start the auto-refresh server, research how to do something – before I was ready to do anything. Now, creating a new theme, a new post, a new page or even a new site requires no setup – I just open up a HTML document and start writing!”
2:38 PM Tuesday, 15 June 2021