Socorro Installation

Revision as of 11:08, 28 August 2013

Overview

This guide illustrates how to install Socorro with the minimum components needed for a medium-sized project. Socorro is used at Mozilla to manage thousands of crashes a day across multiple applications. The full architecture supports distributed loads, throttling, queuing and other optimization techniques required for such a large volume of crashes. Medium-sized projects like PlaneShift can use a simplified architecture, which is what this installation guide describes.

This guide was written in August 2013. I used a Debian Squeeze (6.0.7) server.

The official reference documentation is currently minimal and not complete enough for a successful install. If you want to have a look anyway, it's located here: http://socorro.readthedocs.org/en/latest/installation.html

The full architecture schema is here: http://socorro.readthedocs.org/en/latest/generalarchitecture.html

The architecture we are going to use is here: [IMAGE TO BE ADDED]


Components

This section lists the components and what they are used for. The configuration file paths given here are the ones in effect after you have "deployed for production" (this will be explained later).

Collector

This is the application which receives the dump file from your application. This is the first piece we would like to have working. In our minimal install it runs inside Apache, so it's not started by supervisor. There are ways to have it run separately, but we are not interested in this.

Its main configuration file is /etc/socorro/collector.ini

The collector writes the dumps to a filesystem.


CrashMover (DO NOT USE)

CrashMover is an additional component not used in our minimal install. You can ignore it and all its related config files.

Just for reference, it's started by supervisor.
Starting app is here: /data/socorro/application/scripts/newCrashMover.py
Real app is here: /data/socorro/application/socorro/storage

Monitor

Started by supervisor.
Starting app is here: /data/socorro/application/socorro/monitor/monitor_app.py
Real app is here: /data/socorro/application/socorro/monitor/monitor.py


Webapp

Configuration file: /data/socorro/webapp-django/crashstats/settings/local.py


Middleware

Runs inside Apache with WSGI.
Real app is here: /data/socorro/application/socorro/middleware/middleware_app.py
Configuration file: /etc/socorro/middleware.ini


Directory structure

Before proceeding with the installation, it's important to understand the directory structure created by the standard install, so you can troubleshoot the installation more easily if needed.

/home/planeshift/socorro

This is where I checked out the sources and did the initial installation. This is the initial environment where everything gets built and tested. It's called the "development environment". When the installation is complete, you deploy the necessary pieces to the other directories with the procedure "deploying in production" (see below). After the production deployment, none of the files in this directory (including configs) are used anymore.


/etc/supervisor/conf.d

Contains the supervisor configuration files, like 1-socorro-processor.conf, 2-socorro-crashmover.conf and 3-socorro-monitor.conf. These just point supervisor to the apps to execute under /data/socorro/application/:
Processor: /data/socorro/application/scripts/startProcessor.py
CrashMover: /data/socorro/application/scripts/newCrashMover.py
Monitor: /data/socorro/application/socorro/monitor/monitor_app.py

/etc/socorro

Contains all .ini files, like collector.ini, crashmover.ini, monitor.ini and processor.ini.

/home/socorro

Contains only the uploaded minidumps; no configuration files are present.

/data/socorro

Contains all applications as executed in the production environment. Contains all configuration files under /data/socorro/application/config, like collector.ini, crashmover.ini, monitor.ini and processor.ini. Other configuration files seem to be present in /data/socorro/application/scripts/config in the form of .py files.

/var/log/socorro

Contains all the logs from the applications.
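
As a quick way to see which of the production directories listed above already exist on a machine, a small check like the following can help. This is only a sketch: the function name check_socorro_dirs and its optional prefix argument are made up for this example (the prefix only exists so the check can be tried against a scratch tree instead of the real root).

```shell
#!/bin/sh
# Hypothetical helper (not part of Socorro): report which of the
# directories described in this guide exist.
# $1 is an optional prefix; leave it empty to check the real filesystem.
check_socorro_dirs() {
    prefix="${1:-}"
    missing=0
    for dir in /etc/supervisor/conf.d /etc/socorro /home/socorro \
               /data/socorro /var/log/socorro; do
        if [ -d "${prefix}${dir}" ]; then
            echo "ok      ${dir}"
        else
            echo "missing ${dir}"
            missing=1
        fi
    done
    # Non-zero exit status when at least one directory is absent.
    return "$missing"
}
```

Calling check_socorro_dirs with no argument before and after the production deployment shows which directories that step created.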


How to proceed

> apt-get install python-software-properties

 Setting up python-apt-common (0.7.100.1+squeeze1) ...
 Setting up python-apt (0.7.100.1+squeeze1) ...
 Setting up iso-codes (3.23-1) ...
 Setting up lsb-release (3.2-23.2squeeze1) ...
 Setting up python-gnupginterface (0.3.2-9.1) ...
 Setting up unattended-upgrades (0.62.2) ...
 Setting up python-software-properties (0.60.debian-3) ...

> apt-get install build-essential subversion (already present)

> apt-get install libpq-dev python-virtualenv python-dev

 Setting up libpython2.6 (2.6.6-8+b1) ...
 Setting up python2.6-dev (2.6.6-8+b1) ...
 Setting up python-dev (2.6.6-3+squeeze7) ...
 Setting up python-pkg-resources (0.6.14-4) ...
 Setting up python-setuptools (0.6.14-4) ...
 Setting up python-pip (0.7.2-1) ...
 Setting up python-virtualenv (1.4.9-3squeeze1) ...

Install postgres 9.2

 For squeeze only, update the repos to have postgres 9.2 (default is 8.4, which is too old to work with socorro because it doesn't have JSON support)
 Create /etc/apt/sources.list.d/pgdg.list and add this line:
 deb http://apt.postgresql.org/pub/repos/apt/ squeeze-pgdg main
 > wget --quiet -O - http://apt.postgresql.org/pub/repos/apt/ACCC4CF8.asc | sudo apt-key add -
 > sudo apt-get update
 > apt-get install postgresql-9.2 postgresql-plperl-9.2 postgresql-contrib-9.2 postgresql-server-dev-9.2

> apt-get install rsync python2.6 python2.6-dev libxslt1-dev git-core mercurial

> apt-get install python-psycopg2

> apt-get install libsasl2-dev

Ensure that timezone is set to UTC

> vi /etc/postgresql/9.2/main/postgresql.conf

 timezone = 'UTC'
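
To confirm the setting without opening the file again, something like this sketch could be used. The function name pg_timezone_is_utc is invented for this example; it only looks for an active (uncommented) timezone line and ignores the separate log_timezone setting.

```shell
#!/bin/sh
# Hypothetical helper: succeed only if the given postgresql.conf has an
# uncommented  timezone = 'UTC'  line. Lines starting with '#' and the
# log_timezone setting do not match, because the pattern is anchored.
pg_timezone_is_utc() {
    grep -Eq "^[[:space:]]*timezone[[:space:]]*=[[:space:]]*'UTC'" "$1"
}

# Example use (path as in this guide):
#   pg_timezone_is_utc /etc/postgresql/9.2/main/postgresql.conf && echo "timezone OK"
```

Once postgres has been restarted, running SHOW timezone; inside psql should report the same value.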

Create postgres superuser (as root)

 > su - postgres -c "createuser -s planeshift"

Remove security layer for postgres

 Edit /etc/postgresql/9.2/main/pg_hba.conf and change the authentication method on the following line from 'peer' to 'trust':
 
 host    all         all         127.0.0.1/32          peer
 
 so that it reads:
 
 host    all         all         127.0.0.1/32          trust
 > service postgresql restart
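
The same edit can be sketched with sed instead of a manual edit. The function name pg_hba_trust_loopback is made up for this example, and it assumes the loopback entry is written as "host all all 127.0.0.1/32 <method>" as shown above. It prints the modified file to standard output and leaves the original untouched, so the result can be reviewed first.

```shell
#!/bin/sh
# Hypothetical helper: print the given pg_hba.conf with the auth method on
# the IPv4 loopback line ("host all all 127.0.0.1/32 ...") set to trust.
# All other lines, including the IPv6 and local socket entries, are
# passed through unchanged.
pg_hba_trust_loopback() {
    sed -E 's|^(host[[:space:]]+all[[:space:]]+all[[:space:]]+127\.0\.0\.1/32[[:space:]]+)[[:alnum:]_-]+|\1trust|' "$1"
}

# Example use (path as in this guide):
#   pg_hba_trust_loopback /etc/postgresql/9.2/main/pg_hba.conf > /tmp/pg_hba.conf.new
```

After checking /tmp/pg_hba.conf.new, copy it over the original and restart postgres as shown above.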

Postgres useful commands:

 > /etc/init.d/postgresql start
 > /etc/init.d/postgresql stop
 > psql -U planeshift -d breakpad 
 breakpad# \dt (show tables)
 breakpad# \d products (describes table products)

Get socorro release (as planeshift)

 > cd
 > git clone --depth=1 https://github.com/mozilla/socorro socorro
 > cd socorro
 > git fetch origin --tags --depth=1
 > git checkout 56 (chosen release 56 as the stable one)

Node/npm is required, install it:

 > apt-get install openssl libssl-dev
 > git clone https://github.com/joyent/node.git
 > cd node
 > git tag
 > git checkout v0.9.12
 > ./configure --openssl-libpath=/usr/lib/ssl
 > make
 > make test
 > sudo make install
 > node -v # it's alive!

Update python-pip (as root):

 > pip install --upgrade pip
 > /home/planeshift/socorro/socorro-virtualenv/bin/pip install --upgrade pip

Install lessc

 > npm install less -g

From inside the Socorro checkout

 > export PATH=$PATH:/usr/lib/postgresql/9.2/bin
 > make json_enhancements_pg_extension
 > make test
 > make minidump_stackwalk

Setup environment

 > make bootstrap-dev

Populate PostgreSQL Database

 > cd socorro
 > psql -f sql/roles.sql postgres

as user planeshift

 > (?) make bootstrap-dev
 > . socorro-virtualenv/bin/activate
 > export PYTHONPATH=.
 > DOESNT WORK YET ./socorro/external/postgresql/setupdb_app.py --database_name=breakpad --database_superusername=planeshift
 > ./socorro/external/postgresql/setupdb_app.py --database_name=breakpad --fakedata --dropdb --database_superusername=breakpad_rw --database_superuserpassword=bPassword
 > ./socorro/external/postgresql/setupdb_app.py --database_name=breakpad --fakedata --database_superusername=planeshift --dropdb
 > python socorro/cron/crontabber.py --job=weekly-reports-partitions --force  >> FAILS FAILS FAILS

Copy default config files

 > cp config/collector.ini-dist config/collector.ini
 > cp config/processor.ini-dist config/processor.ini
 > cp config/monitor.ini-dist config/monitor.ini
 > cp config/middleware.ini-dist config/middleware.ini
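
If the copy step has to be repeated, a loop that refuses to overwrite an existing .ini avoids losing local edits. This is a sketch only: the function name copy_dist_configs is invented, and note that it copies every *.ini-dist file it finds in the directory, which may be more than the four files listed above.

```shell
#!/bin/sh
# Hypothetical helper: copy each *.ini-dist in a config directory to the
# matching *.ini, but keep any *.ini that already exists.
copy_dist_configs() {
    for dist in "$1"/*.ini-dist; do
        [ -e "$dist" ] || continue      # no matches: the glob stays literal
        target="${dist%-dist}"          # strip the "-dist" suffix
        if [ -e "$target" ]; then
            echo "kept   $target"
        else
            cp "$dist" "$target"
            echo "copied $target"
        fi
    done
}

# Example use, from inside the Socorro checkout:
#   copy_dist_configs config
```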

Prepare for production usage

 > apt-get install supervisor rsyslog libapache2-mod-wsgi memcached
 > mkdir /etc/socorro
 > mkdir /var/log/socorro
 > mkdir -p /data/socorro
 > useradd socorro
 > chown socorro:socorro /var/log/socorro
 > mkdir -p /home/socorro/primaryCrashStore /home/socorro/fallback /home/socorro/persistent
 > chown www-data:socorro /home/socorro/primaryCrashStore /home/socorro/fallback
 > chmod 2775 /home/socorro/primaryCrashStore /home/socorro/fallback

Generate your own /etc/socorro/collector.ini

 > login as socorro user
 > export PYTHONPATH=/data/socorro/application:/data/socorro/thirdparty
 > python /data/socorro/application/socorro/collector/collector_app.py --admin.conf=/etc/socorro/collector.ini --help
 > python /data/socorro/application/socorro/collector/collector_app.py --admin.conf=/etc/socorro/collector.ini --admin.dump_conf=/tmp/c1.ini
 > cp /tmp/c1.ini /etc/socorro/collector.ini

Generate your own /etc/socorro/processor.ini

 > login as socorro user
 > export PYTHONPATH=/data/socorro/application:/data/socorro/thirdparty
 > python /data/socorro/application/socorro/processor/processor_app.py --admin.conf=/etc/socorro/processor.ini --help
 > chown www-data:socorro /home/socorro [NOT NEEDED? WAS root:root]
 > python /data/socorro/application/socorro/processor/processor_app.py --admin.conf=/etc/socorro/processor.ini --source.crashstorage_class=socorro.external.fs.crashstorage.FSDatedRadixTreeStorage --admin.dump_conf=/tmp/p1.ini
 > edit p1.ini file manually and delete everything inside [c_signature]
 > python /data/socorro/application/socorro/processor/processor_app.py --admin.conf=/tmp/p1.ini --admin.dump_conf=/tmp/p2.ini --destination.storage_classes='socorro.external.postgresql.crashstorage.PostgreSQLCrashStorage, socorro.external.fs.crashstorage.FSRadixTreeStorage'
 > edit p2.ini file manually and delete everything inside [c_signature]
 > python /data/socorro/application/socorro/processor/processor_app.py --admin.conf=/tmp/p2.ini --admin.dump_conf=/tmp/p3.ini --destination.storage1.crashstorage_class=socorro.external.fs.crashstorage.FSRadixTreeStorage
 > edit p3.ini file manually and delete everything inside [c_signature]
 > edit p3.ini and set fs_root=/home/socorro/primaryCrashStore . There should be two places, one under [destination]storage1 and one under [source] 
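
The repeated "delete everything inside [c_signature]" step can be sketched with awk rather than done by hand. The function name strip_ini_section is made up for this example, and the sketch assumes that the section's contents run until the next line that starts with "[" (it also accepts a header written as [[c_signature]]). If the dumped file nests further sub-sections inside c_signature, those would not be removed and the result should be checked manually.

```shell
#!/bin/sh
# Hypothetical helper: print an ini file with the body of one section
# removed. The section header itself is kept; every line after it is
# dropped until the next header line (any line starting with "[").
# $1 = ini file, $2 = section name without brackets.
strip_ini_section() {
    awk -v name="$2" '
        /^[ \t]*\[/ {
            hdr = $0
            gsub(/\[/, "", hdr)       # drop opening brackets
            gsub(/\]/, "", hdr)       # drop closing brackets
            gsub(/[ \t]/, "", hdr)    # drop indentation and padding
            skipping = (hdr == name)
            print
            next
        }
        !skipping { print }
    ' "$1"
}

# Example use:
#   strip_ini_section /tmp/p1.ini c_signature > /tmp/p1.clean.ini
```

To find the two fs_root entries mentioned in the last step, grep -n fs_root /tmp/p3.ini lists them with their line numbers.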

Generate your own /etc/socorro/middleware.ini

 > python /data/socorro/application/socorro/middleware/middleware_app.py --admin.conf=/etc/socorro/middleware.ini --help
 > python /data/socorro/application/socorro/middleware/middleware_app.py --admin.conf=/etc/socorro/middleware.ini --admin.dump_conf=/tmp/m1.ini
 > edit /tmp/m1.ini and change: filesystem_class='socorro.external.fs.crashstorage.FSDatedRadixTreeStorage'
 > comment out 'platforms', 'implementation_list' and 'service_overrides' variables as those are printed wrongly by the dumper
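
Commenting out the three variables can also be sketched with sed. The function name comment_out_keys is invented for this example, and it assumes each variable appears as "name=value" at the start of a line (optionally indented). The result goes to standard output so it can be compared with the original before replacing it.

```shell
#!/bin/sh
# Hypothetical helper: print the given ini file with the 'platforms',
# 'implementation_list' and 'service_overrides' assignments commented out.
# Lines that are already commented, and all other keys, are left as they are.
comment_out_keys() {
    sed -E 's/^([[:space:]]*)(platforms|implementation_list|service_overrides)([[:space:]]*=)/\1#\2\3/' "$1"
}

# Example use:
#   comment_out_keys /tmp/m1.ini > /tmp/m2.ini
```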

Edit the database (as user planeshift)

 > psql -U planeshift -d breakpad
 > INSERT INTO products VALUES ('PlaneShift','0.1','0.1','PlaneShift','0');
 > INSERT INTO product_versions VALUES (17,'PlaneShift','0.5','0.5.10','0.5.10',0,'0.5.10','2013-08-23','2013-12-23','f','Release','f','f',null);
 > DELETE from products where product_name='PlaneShift';
 > DELETE from product_versions where product_name='PlaneShift';
 > DELETE from releases_raw where product_name='PlaneShift';
 > DELETE from product_productid_map where product_name='PlaneShift';
 > DELETE from product_release_channels where product_name='PlaneShift';
 
 > SELECT add_new_product('PlaneShift', '0.5.10','12345','PlaneShift',1);
 > SELECT add_new_release ('PlaneShift','0.5.10','Release',201305051111,'Windows',1,'release','f','f');
 > select update_product_versions(200);  // generates products version info for older releases, 200 days.

Install Socorro for production

 > cd /home/planeshift/socorro
 > make install
 (as root)
 > cp config/*.ini /etc/socorro/
 edit /etc/socorro/collector.ini and uncomment these lines:
 wsgi_server_class='socorro.webapi.servers.ApacheModWSGI'
 fs_root='/home/socorro/primaryCrashStore'
 crashstorage_class='socorro.external.fs.crashstorage.FSDatedRadixTreeStorage'

Cronjobs for Socorro

 > cp scripts/crons/socorrorc /etc/socorro/
 edit crontab -e
 > */5 * * * * socorro /data/socorro/application/scripts/crons/crontabber.sh

Start daemons

 > cp puppet/files/etc_supervisor/*.conf /etc/supervisor/conf.d/
 > /etc/init.d/supervisor stop
 > /etc/init.d/supervisor start

Configure Apache

 > cp puppet/files/etc_apache2_sites-available/{crash-reports,crash-stats,socorro-api} /etc/apache2/sites-available


Activate apache modules

 > a2enmod headers
 > a2enmod proxy
 > a2enmod rewrite
 > /etc/init.d/apache2 restart

Set access rights on cache dir

 > chmod -R 777 /data/socorro/webapp-django/static/CACHE/

Create a screen startup file “launchScorro” that'll be used for the Socorro scripts:

 cd /home/planeshift/socorro
 . socorro-virtualenv/bin/activate
 export PYTHONPATH=.
 startup_message off
 autodetach on
 defscrollback 10000
 termcap xterm 'Co#256:AB=\E[48;5;%dm:AF=\E[38;5;%dm'
 screen -S processor python socorro/processor/processor_app.py --admin.conf=./config/processor.ini
 screen -S monitor python socorro/monitor/monitor_app.py --admin.conf=./config/monitor.ini
 screen -S middleware python socorro/middleware/middleware_app.py --admin.conf=config/middleware.ini
 [NOT NEEDED as it runs inside Apache] screen -S collector python socorro/collector/collector_app.py --admin.conf=./config/collector.ini


Minidump simulation

 > cd /data/socorro/stackwalk/bin/
 > ./minidump_upload -p Planeshift -v 0.5.12 6cc10361-c469-1504-1d91efef-7b8e750c.dmp http://194.116.72.94/crash-reports/submit
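
When several dump files need to be submitted, the same minidump_upload call can be wrapped in a loop. This is a sketch under assumptions: the function name upload_all_dumps and the DRY_RUN variable are made up for this example, and it expects to be run from the directory that contains minidump_upload. With DRY_RUN=1 it only prints the commands it would run.

```shell
#!/bin/sh
# Hypothetical helper: submit every .dmp file found in a directory using
# the minidump_upload tool from this guide.
# $1 = directory with dumps, $2 = product, $3 = version, $4 = submit URL.
# Set DRY_RUN=1 to print the commands instead of running them.
upload_all_dumps() {
    dump_dir="$1"; product="$2"; version="$3"; url="$4"
    for dump in "$dump_dir"/*.dmp; do
        [ -e "$dump" ] || continue      # directory holds no .dmp files
        if [ "${DRY_RUN:-0}" = "1" ]; then
            echo "./minidump_upload -p $product -v $version $dump $url"
        else
            ./minidump_upload -p "$product" -v "$version" "$dump" "$url"
        fi
    done
}

# Example use (URL as in this guide):
#   cd /data/socorro/stackwalk/bin/
#   DRY_RUN=1 upload_all_dumps /tmp/dumps PlaneShift 0.5.12 http://194.116.72.94/crash-reports/submit
```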


Configure WebAPP

Edit configuration file: /data/socorro/webapp-django/crashstats/settings/local.py

DEFAULT_PRODUCT = 'PlaneShift'


Access the web UI

 http://194.116.72.94/crash-stats/home/products/WaterWolf


Troubleshooting

1) http://194.116.72.94/crash-stats/home/frontpage_json?product=PlaneShift&versions=0.5.12 fails with "unable to open database file"

 Request Method: GET
 Request URL: http://194.116.72.94/crash-stats/home/frontpage_json?product=PlaneShift&versions=0.5.12
 Django Version: 1.4.5
 Exception Type: OperationalError
 Exception Value: unable to open database file
 Exception Location: /data/socorro/webapp-django/vendor/lib/python/django/db/backends/sqlite3/base.py in _sqlite_create_connection, line 278

Answer: this database is used for authenticated sessions. It does not need to be the Socorro postgres db, but it needs to be somewhere with write access: either a sqlite db or postgres/mysql/anything Django supports.

Solution: edit the database setting in /data/socorro/webapp-django/crashstats/settings/base.py:

'NAME': '/home/socorro/sqlite.crashstats.db'

 > cp /data/socorro/webapp-django/sqlite.crashstats.db /home/socorro
 > chown www-data:socorro /home/socorro/sqlite.crashstats.db
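
To check the fix, the following sketch tests whether the database file and its directory are writable by whoever runs it. The function name check_sqlite_writable is invented for this example. The directory is checked as well because SQLite creates its journal files next to the database file.

```shell
#!/bin/sh
# Hypothetical helper: verify that a SQLite database file exists and that
# both the file and its containing directory are writable by the current
# user. Prints the first problem found and returns non-zero.
check_sqlite_writable() {
    db="$1"
    dir=$(dirname "$db")
    [ -f "$db" ] || { echo "missing file: $db"; return 1; }
    [ -w "$db" ] || { echo "file not writable: $db"; return 1; }
    [ -w "$dir" ] || { echo "directory not writable: $dir"; return 1; }
    echo "ok: $db"
}

# Example use (path as in this guide):
#   check_sqlite_writable /home/socorro/sqlite.crashstats.db
```

The result is only meaningful when the check runs as the user Apache runs as (www-data in this guide), since that is the user the webapp opens the file with.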