Saturday, May 31, 2008

From MySQL to PostgreSQL

Why? In brief, it is said that mysql is not as stable as postgresql. Postgresql or pgsql focuses on a single database engine as compared to mysql which has a pluggable engine architecture and has multiple engines. Also postgresql is well designed as compared to mysql. psql console is much better than mysql console (you will realize it when you use it). It is supposed to be much more scalable and have better performance than mysql. Pgsql uses a more standard sql language as compared to mysql.

Both have fulltext search capabilities. Though it is said that the fulltext search of pgsql is better than mysql, but i still have to look into it. And the most important of all - pgsql is a full fledged RDBMS, whereas mysql (using the default MyISAM engine) is a DBMS. Data integrity is better in pgsql as compared to mysql. Also i have seen lots of corrupt tables in mysql(might be because i have used mysql throughout my career), and have heard(on the web) that mysql is more crash prone than pgsql.

The final question then comes down to speed. Which one is faster? Logically, since mysql is a DBMS and it does not maintain foreign key relations, it should be faster than pgsql. I will benchmark and blog the results sometime later.

Basically, pgsql is more stable & reliable than mysql. Pgsql has a rich set of features and data integrity is well maintained in pgsql.

Lets look at some of the steps you need to follow if you decide to switch to pgsql from mysql:


  • First of all, down the database engine from www.postgresql.org. Untar it and compile it.

    $ tar -xvjf postgresql-8.3.1.tar.bz2
    $ cd postgresql-8.3.1
    $ ./configure --with-openssl --enable-thread-safety --with-libxml
    (check out the options that you need using ./configure --help)
    $ make
    $ sudo make install

    By default this will install pgsql in /usr/local directory

  • Next, create a user postgres and the database directory for pgsql
    $ adduser postgres
    $ mkdir /usr/local/pgsql/data
    $ chown postgres.postgres /usr/local/pgsql/data

  • Create the pgsql database:
    Firstly, log into mysql and do
    mysql> show variables like '%character';
    +--------------------------+-----------------------------------------------------------------+
    | Variable_name | Value |
    +--------------------------+-----------------------------------------------------------------+
    | character_set_client | latin1 |
    | character_set_connection | latin1 |
    | character_set_database | latin1 |
    | character_set_filesystem | binary |
    | character_set_results | latin1 |
    | character_set_server | latin1 |
    | character_set_system | utf8 |
    | character_sets_dir | /usr/local/mysql-5.1.24-rc-linux-x86_64-glibc23/share/charsets/ |
    +--------------------------+-----------------------------------------------------------------+

    You will see here that the character set for the database & server is latin1. Even though the default database is utf8. If you create the pgsql database using the default command, it will create a utf8 database and you will face problems importing your data into it.

    So, create a latin1 database for pgsql
    $ su - postgres
    $ initdb -D /usr/local/pgsql/data -E latin1 --locale=C

  • Now you have postgresql installed and you can create your own databases and tables. But before proceeding forward, do psql from user postgres's shell.

    start pgsql:

    $ /usr/local/pgsql/bin/postgres -D /usr/local/pgsql/data >~/logfile 2>&1 &

    $ psql
    Welcome to psql 8.3.1, the PostgreSQL interactive terminal.

    Type: \copyright for distribution terms
    \h for help with SQL commands
    \? for help with psql commands
    \g or terminate with semicolon to execute query
    \q to quit

    postgres=#


    The =# says that you are logged in as a super user. To login as another user, create another user and give him required privileges. And then su to the user and login to pgsql from that user.

    postgres=# CREATE USER jayant WITH CREATEDB INHERIT LOGIN;
    postgres=# \q
    $ su - jayant
    $ createdb test
    $ psql test
    test=>


    Here => says that i am not a super user. Now i can create my tables in the test database or create more databases.

    For help on how to create users do

    test=> \h CREATE USER

    And it will give you the syntax for creating a user in postgres

  • To convert your table from mysql format to pgsql format, just dump the table and data separately from mysql using the mysqldump --no-data and mysqldump --no-create-info respectively.

    And you will get two files <mysql_table>.sql & <mysql_table_data>.sql

    download a perl script mysql2pgsql.perl which will be able to convert the sql for your table into appropriate format for pgsql.

  • Now load the table and data into pgsql

    $ psql -f <mysql_table>.sql
    $ psql -f <mysql_table_data>.sql



It looks simple, but it is not that simple. Some points that need to be remembered while using pgsql


  • auto_increment column is defined using the SERIAL word. So, to create an auto_increment primary key column the syntax would be

    CREATE TABLE MYTABLE(ID SERIAL NOT NULL PRIMARY KEY, ....);

    This would also create a sequence mytable_id_seq.

  • While importing data from the mysql dump you may get some warning about backslash escapes in your sql. To remove this warning you will have to set the escape_string_warning to off.

    edit your postgresql.conf file which can be found in the /usr/local/pgsql/data directory and change the variable to off. Now restart pgsql.

  • To start and stop pgsql following commands can be used

    start pgsql
    $ /usr/local/pgsql/bin/postgres -D /usr/local/pgsql/data >~/logfile 2>&1 &

    stop pgsql
    $ /usr/local/pgsql/bin/pg_ctl stop -D /usr/local/pgsql/data

  • Another thing to note is that you dont have custom queries like 'insert ignore into', 'replace into' or 'insert into ... on duplicate key...' in pgsql. You will need to create rules on the table to handle these cases.

    test=> \h create rule
    Command: CREATE RULE
    Description: define a new rewrite rule
    Syntax:
    CREATE [ OR REPLACE ] RULE name AS ON event
    TO table [ WHERE condition ]
    DO [ ALSO | INSTEAD ] { NOTHING | command | ( command ; command ... ) }


    So for exampe to create an insert ignore into type of rule, the following rule needs to be created.

    test=> CREATE OR REPLACE RULE "insert_ignore_mytable" AS ON INSERT TO "mytable" WHERE EXISTS (SELECT 1 FROM "mytable" WHERE id = NEW.id) DO INSTEAD NOTHING;

  • Remember SHOW PROCESSLIST of mysql which used to list down all the processes.

    To list down all the processes for pgsql following command needs to be run

    test=> select * from pg_stat_activity;



We will looking more and more into pgsql.

No comments: