HTML vs ePub

I know ... this one sounds silly but it is not ... and I can prove it to you.

What is publishing about?

It comes from the verb "to publish". In that respect the exact format you publish in is just a matter of taste. Good or bad taste is ... how to say it ... something that you either have or not ... but you have the right to have your own :-)
https://theamazingworldofpsychiatry.files.wordpress.com/2010/12/maneatingspaghetti.jpg

From that point of view it really does not matter what you use as long as it is at hand for the reader.
From an access point of view HTML has an advantage ... it is at the core of the web and even at the base of EPUB ... so it is better (or shall we say "it can be"? We know too well that not everything and not everybody lives up to its possibilities).

Coming back to the actual question ... if we speak about taste then we should first decide whether both formats are the same type of fruit.

Do HTML and EPUB even compare in any way?

I know that this will always be a debate but publishing is about distributing information. It does not matter the channel, the looks, the way you get your hands on it or whatever else. As long as you manage to put the mighty information in the hands of your readers you are good to go. In that sense HTML and EPUB can be compared.
Nevertheless we can agree that there are differences between those fruits, and below are some of the major points of discussion.

So how do they compare?


Hold your horses ... it will not be a straight technical comparison. Too easy to do and if I am allowed to say it not so kindly .... it's a s###ty job with no serious outcome.

So here you have my points:

Self sufficiency

That means whether or not a format has all it takes to do the job.
At first sight it looks like each format has a different scope and is self sufficient for different targets but is it?

HTML comes with no clothes so to speak. You have the text and some semantics inside it (like headings, lists and tables). Everything else (like images, heavy media and fonts for typography) lives outside, and for good reasons (if you need me to explain this we will need some time to get you through some good old principles of accessibility and more).
EPUB on the other hand includes everything that defines a publication in the standard way of thinking. It gives you everything that HTML has and then some. You have the structure of the publication (like a table of contents), the fonts for good looks (if you like to include them), even heavy media (like movies and sounds), not to mention a separate cover.
So what?
Is that a first? Does anybody here remember the Internet Explorer Web Archive format? It was called MHT and did many of the things that EPUB does, long in advance. Was it used in any way? No. Why? We could discuss the fact that it came from Microsoft, and they have a great way of losing this kind of war, but the problem was a different one. Nobody cared about MHT and what it offered. You used MHT only in some very specific situations, where you wanted to store or send a complete representation of a web page and had no other means to do it. That was all. Most publishers did not know it existed. Shall I also remind you of the LIT format? It also came from Microsoft. Long story short, it did not work. I can also recall MOBI and REB and many others. We will discuss later what those formats have in common.
So I will say that both are self sufficient. It just depends on what you expect from them and how you prepare for the battle between two products based on each of those formats.
No winner here.
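
To make the "EPUB is HTML plus some packaging" point concrete, here is a minimal sketch in Python that wraps one XHTML page into an EPUB 3 container using only the standard library. The book title, identifier, date, file names and the build_epub function are all made up for illustration, and I make no promise that every reader app or validator will be happy with the result.

```python
import zipfile

# Tells the reading system where the package file lives.
CONTAINER_XML = """<?xml version="1.0" encoding="UTF-8"?>
<container version="1.0" xmlns="urn:oasis:names:tc:opendocument:xmlns:container">
  <rootfiles>
    <rootfile full-path="OEBPS/content.opf" media-type="application/oebps-package+xml"/>
  </rootfiles>
</container>
"""

# The package file: metadata, the list of files and the reading order.
# All values here are placeholders invented for this sketch.
CONTENT_OPF = """<?xml version="1.0" encoding="UTF-8"?>
<package xmlns="http://www.idpf.org/2007/opf" version="3.0" unique-identifier="book-id">
  <metadata xmlns:dc="http://purl.org/dc/elements/1.1/">
    <dc:identifier id="book-id">urn:uuid:00000000-0000-0000-0000-000000000000</dc:identifier>
    <dc:title>Hypothetical Sample Book</dc:title>
    <dc:language>en</dc:language>
    <meta property="dcterms:modified">2016-01-01T00:00:00Z</meta>
  </metadata>
  <manifest>
    <item id="nav" href="nav.xhtml" media-type="application/xhtml+xml" properties="nav"/>
    <item id="chapter1" href="chapter1.xhtml" media-type="application/xhtml+xml"/>
  </manifest>
  <spine>
    <itemref idref="chapter1"/>
  </spine>
</package>
"""

# The table of contents is itself just an XHTML page.
NAV_XHTML = """<?xml version="1.0" encoding="UTF-8"?>
<html xmlns="http://www.w3.org/1999/xhtml" xmlns:epub="http://www.idpf.org/2007/ops">
  <head><title>Contents</title></head>
  <body>
    <nav epub:type="toc">
      <ol><li><a href="chapter1.xhtml">Chapter 1</a></li></ol>
    </nav>
  </body>
</html>
"""

# The actual content: the same kind of markup a browser would show.
CHAPTER_XHTML = """<?xml version="1.0" encoding="UTF-8"?>
<html xmlns="http://www.w3.org/1999/xhtml">
  <head><title>Chapter 1</title></head>
  <body><h1>Chapter 1</h1><p>Plain old HTML, just zipped.</p></body>
</html>
"""


def build_epub(path):
    """Write a one-chapter EPUB to the given path."""
    with zipfile.ZipFile(path, "w") as zf:
        # The mimetype entry has to come first and must not be compressed.
        zf.writestr("mimetype", "application/epub+zip",
                    compress_type=zipfile.ZIP_STORED)
        for name, body in [
            ("META-INF/container.xml", CONTAINER_XML),
            ("OEBPS/content.opf", CONTENT_OPF),
            ("OEBPS/nav.xhtml", NAV_XHTML),
            ("OEBPS/chapter1.xhtml", CHAPTER_XHTML),
        ]:
            zf.writestr(name, body, compress_type=zipfile.ZIP_DEFLATED)


if __name__ == "__main__":
    build_epub("sample.epub")
```

Three of the four text files are bookkeeping. The only thing the reader actually reads is chapter1.xhtml.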

It gives Looks and Presentation

I like well designed products. Who doesn't? The look makes the difference. But who has the looks here?
HTML comes empty handed in the presentation area as said before. It was designed to share information with no fuss. Just put it there with paragraphs and headings and that was all that it was supposed to offer. And this is what it does.
EPUB on the other hand has the usual edge. It gives you a cover that the library software, or whatever other software it interacts with, can show you so that you know which book it is and how it looks. You also have all the bells and whistles of the modern era in it and you might think that you can also do some typography (I will not comment too much on this point ... I will reserve another post for it :-) ).
So what?
I am a supporter of greatness in simplicity. You might disagree but if you use an Apple product and you like it you know what I am saying. Nobody stops you from putting a CSS file and some fonts and pictures alongside your HTML files. You can call that package a book and be happy with it. Will that make it look good? It might, but not necessarily. The format is not the answer to great looking books. Neither is paper and ink and good old printing technology. It is about what you do with the format and the technology in your hands. So EPUB does not bring anything to the table from this perspective. It just inherits some characteristics of HTML and its associated technologies and gives you the sensation that you have something new. It is not new. It is not better from a presentation perspective. It is just a convention by which some companies agreed to present content to you in a specific format. A format that can look as bad as any other.
A draw here also.

It is Ready for business 

Being economical and providing the grounds for success is important but it takes some guts to be sincere and define what will be good in the long run.

Maybe we should first agree on what "ready for business" means. If you ask the normal guy he will tell you that everything that brings money into the company is. My take is that the normal guy has below average entrepreneurial spirit and here are my reasons.
HTML was invented in the scientific sector and it missed the class on business reasoning. No digital rights management you will say, no "big guns" supporting the business case, no sales channels, no nothing. Just a format to distribute content. But it does that very well and it is at the base of many great content related businesses right now and in the foreseeable future (let's just mention the web itself and the social network phenomenon). What would all those be without HTML?
EPUB on the other hand has all of the above. It offers DRM (actually the Reader/App that interacts with it offers it but that is a different story) ... It comes with the backing of all major publishers ... It has multiple sales channels (very real web shops selling books in this format) and we might find some other advantages.
So what?
Let's start with some basic business reasoning:
Big business comes either through the implementation of basic economic principles (like productivity, repeatability, reputation, ...) or through straight innovation. Both EPUB and HTML are digital formats so implementing the basic economic principles comes by default.
If we talk about production you might even see some advantages for HTML. I know that for sure, and anybody who was involved in the production of EPUB files also knows that it is not a straightforward job: you do not have ready made tools, you don't know for sure how it will be displayed on the end user's screen, and there are other aspects that make your life miserable. It is much cheaper to produce straight HTML files than even a simple EPUB. No need to explain too much, but we can give it a try in the comments if you like. From this perspective you can say that HTML helps you keep some of your hard earned money in your pocket, and this is very good business any day.
So we still have innovation. This is again related to what you do with the format, not what the format offers you. You can build great things with both formats but let's be serious ... when you have a very open, clear and well understood "jack of all trades" format like HTML and an open and shiny but also very much enforced "Goliath like" format ... which will you consider the best choice as the base of your next big publishing product? Maybe you will not choose HTML for that because it looks so "light", but you will most probably have a hard time coming up with a really innovative product inside the limits imposed by EPUB.
You can call it a draw also but realistically speaking EPUB was made as a business tool and it does not live up to its expectations. If a business is not a straight winner there are more chances that it will be a flop in the end. It might not always be the case but in most cases it is.

Some conclusions before you get bored?

If we keep the same line of logic HTML looks like a rougher approach. Something like a draft manuscript in comparison with the hardcover published book. I can agree on this, but is that in any way a disadvantage? In this frugal society does anyone think that the hardcover will make the difference in the long run? People today are starting to question the necessity of owning "harder" things like cars. In 20 years, if not less, who will own a car in the developed world when the option of using a shared vehicle (or just asking for one to come to you) will be so much more appealing? How many members of that society will buy paper books and keep them like a treasure when they do not own a car and maybe not even the house?
Don't get me wrong. I have lead and ink in my blood for the generations of printers in my family and I am looking to buy a letterpress in the near future but that will not change the future of the printed books.
No Starship Enterprise will carry any printed books ... except maybe ... for curiosity.

So ... going back to the business side of the problem ... should you go with EPUB or HTML?

Neither of them is a guarantee for success but if I am going to put my money on this I will go with HTML any day.
It gives you freedom to innovate. It lets you encapsulate the content in any kind of business concept and distribution channel you might like ... It makes me feel light and nimble and able to build something that goes beyond what was already created before. This is what business is about.
Enjoy.

Postgres first steps on Ubuntu

Long story short: I was deploying one of my platforms on a virtual machine running Ubuntu 16.04.
As usual I start by installing Apache, Mod Python, my preferred editor (Geany), copy the Python files in the right place and here we reach a more complex step ... getting the database up.

Actually it is not complex but there are some things that you have to remember. If you don't do them in the correct order you might lose some time getting it right. This just happened to me yesterday, so this morning I decided to write this post. Why? Because everywhere I look on the web the script is bloated with stuff that is not needed.

Here we are:

# Open a terminal and install PostgreSQL as follows:

sudo apt-get install postgresql libpq-dev postgresql-client postgresql-client-common

# Open pg_hba.conf to see if everything is OK.
# Your PostgreSQL version might be different (mine is 9.5 as you can see),
# so please check yours and change the path below accordingly
sudo nano /etc/postgresql/9.5/main/pg_hba.conf

# You should see the following line immediately after the long comments at the beginning:
local   all             postgres                          peer

# If it is different please make it match the example given, save the file (CTRL+O) and exit nano (CTRL+X)
# Restart the service (not needed if you did not change the file)
sudo service postgresql restart

# Enter the PostgreSQL prompt (psql)
sudo -u postgres psql

-- Now you are at the helm and can start doing the actual interesting stuff
-- (inside psql comments start with -- and not with #)
-- Get control over the postgres user ... you will need it :)
ALTER USER postgres PASSWORD 'password1';

-- Create your custom user
CREATE USER my_user;
ALTER USER my_user PASSWORD 'password2';

-- This one depends on what you need to do with your user
-- (superuser rights are rarely needed, so skip it if you are not sure)
ALTER USER my_user WITH SUPERUSER;

-- Create the database
CREATE DATABASE my_database;
GRANT ALL PRIVILEGES ON DATABASE my_database TO my_user;


# Now you can exit the psql prompt (type \q) and add pgadmin3 to do the rest of the stuff using a GUI
sudo apt-get install pgadmin3
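
If you prefer not to eyeball pg_hba.conf, the check from the beginning of the script can be sketched in a few lines of Python. The function names are mine, and the matching is a deliberate simplification: it only understands plain user names and the keyword all, and it ignores the database column and any fancier syntax the file allows.

```python
def active_rules(pg_hba_text):
    """Return the non-comment, non-blank lines of pg_hba.conf as lists of fields."""
    rules = []
    for line in pg_hba_text.splitlines():
        # Drop everything after a # (simplification: quoted # is not handled).
        line = line.split("#", 1)[0].strip()
        if line:
            rules.append(line.split())
    return rules


def postgres_uses_peer(pg_hba_text):
    """True if the first local rule that applies to user postgres says peer."""
    for fields in active_rules(pg_hba_text):
        # A local rule has the form: local DATABASE USER METHOD [OPTIONS]
        if fields[0] == "local" and len(fields) >= 4 and fields[2] in ("postgres", "all"):
            # PostgreSQL uses the first matching rule, so we stop here.
            return fields[3] == "peer"
    return False


if __name__ == "__main__":
    # The path assumes version 9.5, as in the script above; reading the file
    # usually requires root or the postgres user.
    with open("/etc/postgresql/9.5/main/pg_hba.conf") as handle:
        print(postgres_uses_peer(handle.read()))
```

If it prints False, open the file with nano as described above and fix the line by hand.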

Enjoy :)

Free PDF/A validation

And finally there is some light at the end of the tunnel.
Remember the days when all you could do in terms of validating PDF files was to use PitStop?
Then those of us who had to deal with PDF/A realized that this was not enough. And here Callas entered the scene. Then Acrobat followed and maybe you considered that it was finally done.
Those were interesting times but let's be fair ... it is not done until the fat lady sings and these days the fat lady is something other than a big name company. PDF/A is an open format and there should be at least one way to validate it using free software. It is not about socialism (I am also producing commercial tools) but about the free services (archives, libraries and NGOs) that have to comply with this format but had no tools to do it.
So ... if you have to validate PDF/A documents using free tools, check veraPDF (http://verapdf.org/home/). Some well spent EU money :-)

Enjoy it.

Hyphens & Dashes

Great resource on hyphens and dashes. I can't think of anything more but who knows. Suggestions are welcome :-)

Kerning play

Found this on the web and enjoyed it a lot:
http://type.method.ac/#
It was really fun to see how your kerning abilities compare with those of the professionals. Well, I presume the actual comparison is done against the results of a kerning algorithm ... At least this is what I would do.
If I think a little I'll say that I am not that happy to be compared with an algorithm and lose, but hey ... it happened to others much bigger than me :)

To tag or not to tag ... choices in digital classification of items

A couple of days ago I had a discussion regarding the options available for organising items in a system.
Initially the system had followed the traditional way of using a tree and attaching items to the tree branches. It worked OK but new times bring new ideas and the tagging concept was put on the table.
Nothing new ... a lot of flame wars between the tagging fans and the rest of the world. Everyone thinks he has the answer.
Is there really an answer?
Is there really a choice?

If you think a little ... you will see ... there is no real choice.
I think that it is more a matter of understanding the concept of organising things, how we do it and how we visualize things.
We naturally organise physical things ... books, underwear, CDs, and so on.
We are used to putting those things in physical repositories ... a box, a drawer, on a shelf etc.
Those physical items and the physical repositories we use for them are right there at hand ... one book existing on that specific shelf ... one CD sitting in that specific box.
There is one instance of the book and one instance of the shelf in the same place at the same time.
At any one time we make a choice to store the item in one specific, clearly identified place.
When you go digital things change.
Now you work with files or with records in a database.
You can easily create a new instance of the digital item or, even smarter, you can use links to have the same item in different places at the same time. Wow ... that's really different.
The fact is that we humans are very stubborn and prefer to use concepts that are already in place. Contrary to the usual thinking, we are not very adaptable.
In order to manage things we first created the idea of the tree ... each file was attached to a branch and we were happy with that.
We know that with a physical item this is the way to do it. Everybody considered the idea of attaching a file to multiple folders and subfolders on your disk drive stupid, and here we were ... with limited options for using a digital repository.
And this is how we got the hierarchical way of organising digital items ... and it was useful for organising documents based on a taxonomy.

Then somebody said that this is not flexible enough because it did not let us attach the item to multiple branches at the same time, and a great new idea was imported ... most probably from the library world ... let's attach to each item a number of tags that describe it in full detail.
And we created the tagging system of organising electronic things and it was useful when you needed to know what subjects a web article was covering.

Then somebody came up and said that if we use tags we can't organise the items in a hierarchical way so we have to come back to the tree ... but he also wants the flexibility of the tags ... and here we have a mixed system.

Does it make sense for you to consider that an evolution?
From a technical point of view nothing stops a programmer from displaying the hierarchy of the tree in a tagged interface and vice versa. Tags or trees, they are the same ... you either attach the item to a property (the tree branches) or attach the property (tags) to the item. Saying that this is different is like saying that 2 + 3 is bigger than 3 + 2.
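
For the programmers in the audience, here is a small sketch of that "2 + 3 versus 3 + 2" claim in Python. The branch names and file names are invented for the example, and the invert function is my own helper, not something taken from an existing system.

```python
from collections import defaultdict


def invert(mapping):
    """Flip a {key: set of values} mapping into {value: set of keys}."""
    flipped = defaultdict(set)
    for key, values in mapping.items():
        for value in values:
            flipped[value].add(key)
    return dict(flipped)


# "Tree" view: each branch (property) holds the items attached to it.
branches = {
    "books/typography": {"kerning-guide.pdf"},
    "books/publishing": {"epub-notes.html", "kerning-guide.pdf"},
}

# "Tag" view: each item carries the properties attached to it.
tags = invert(branches)

# Flipping the tag view gives back exactly the tree view we started from.
assert invert(tags) == branches
```

The same data sits behind both views. Whether the user sees folders or tags is a decision about the interface, not about the storage.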

I will not say that this is the definitive article on the subject. That would be too easy. I just hope that somebody out there will get the idea and realise that you can just change the interface and have the advantages of the other system.

The only matter left to discuss is to take a closer look at your taxonomy ... this is the real issue. Is the taxonomy that you intended to use for tagging different in any way from the classes defined in the tree?


Have a good day ... and only deal with great taxonomies :)