Detailed instructions for installing Apache Superset on an Ubuntu EC2 virtual machine (AWS).
Superset
Apache Superset is an open-source data visualization tool for representing data graphically. Superset was originally created at Airbnb and later released to the Apache community. It is written in Python and uses the Flask framework for all web interactions. Superset supports most RDBMSs through SQLAlchemy.
EC2
- Open the console, select EC2
- Launch instance
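- Ubuntu, free-tier eligible: t2.micro
- Storage size (I chose 10 GB)
- Security group: allow all traffic in the inbound rule (convenient for a test machine; otherwise restrict it to SSH on port 22 and the Superset port)
- Assign a key pair to the instance (for SSH)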
- Launch it
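Once the instance is running, the key from the list above is what lets you log in. A minimal sketch of that step, assuming an official Ubuntu AMI (whose default login user is `ubuntu`); the function name `connect_ec2` and both arguments are placeholders of mine, not part of the original steps:

```shell
# Sketch: connect to the new instance over SSH.
# Both arguments are placeholders: the path to your downloaded .pem key
# and the instance's Public IPv4 DNS from the EC2 console.
connect_ec2() {
    key_file="$1"
    public_dns="$2"
    # ssh refuses private keys that other users can read
    chmod 400 "$key_file" || return 1
    # "ubuntu" is the default login user on official Ubuntu AMIs
    ssh -i "$key_file" "ubuntu@$public_dns"
}

# Usage (hypothetical values):
# connect_ec2 ~/Downloads/my-key.pem your_Public_IPv4_DNS
```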
Setup
Ubuntu 22.04 ships Python 3.10 as the default, but Superset doesn’t support Python 3.10 yet, so we will create a virtual environment that runs Python 3.9:
```shell
sudo apt-get update
sudo apt install software-properties-common
sudo add-apt-repository ppa:deadsnakes/ppa
sudo apt install python3.9
# Press "y" to let the process continue

python3.9 --version
```
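Before going further it is worth confirming that the interpreter you just installed really is a 3.9 release. The sketch below is only an illustration; the helper name `is_python39` is my own, and it assumes `python3.9 --version` prints a line of the form `Python 3.9.x`:

```shell
# Succeeds only for a 3.9.x version string such as "3.9.16".
is_python39() {
    case "$1" in
        3.9.*) return 0 ;;
        *) return 1 ;;
    esac
}

# Usage: check the interpreter that was just installed, if it is on PATH.
if command -v python3.9 >/dev/null 2>&1; then
    # Keep the second word of the "Python X.Y.Z" line
    installed="$(python3.9 --version 2>&1 | cut -d' ' -f2)"
    if is_python39 "$installed"; then
        echo "Found Python $installed"
    else
        echo "Unexpected version: $installed" >&2
    fi
fi
```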
Install the virtual environment tooling for Python 3.9, then create and activate an environment:
```shell
# If apt cannot find this package, skip it: the venv created below normally ships its own pip
sudo apt-get install python3.9-pip
sudo apt install python3.9-venv
python3.9 -m venv name_env
source name_env/bin/activate
```
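Installing packages outside the environment by accident is an easy mistake, so a quick check that the activation worked can help. This is a sketch under the assumption that activation sets `VIRTUAL_ENV` and puts the environment's `bin` directory first on `PATH`, which is what the standard `venv` activate script does; `venv_is_active` is a name I made up:

```shell
# Succeeds when a virtual environment is active and its python is first on PATH.
venv_is_active() {
    [ -n "${VIRTUAL_ENV:-}" ] &&
        [ "$(command -v python)" = "$VIRTUAL_ENV/bin/python" ]
}

# Usage:
# venv_is_active && echo "inside $VIRTUAL_ENV" || echo "activate name_env first"
```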
Install the libraries listed in the requirements.txt file:
```shell
pip3 install -r requirements.txt
```
Contents of the requirements.txt file:
```text
aiohttp==3.8.3
aiosignal==1.3.1
alembic==1.8.1
amqp==5.1.1
apache-superset==2.0.0
apispec==3.3.2
async-generator==1.10
async-timeout==4.0.2
attrs==22.1.0
Babel==2.11.0
backoff==2.2.1
billiard==3.6.4.0
bleach==3.3.1
Brotli==1.0.9
cachelib==0.4.1
celery==5.2.7
certifi==2022.9.24
cffi==1.15.1
charset-normalizer==2.1.1
click==8.1.3
click-didyoumean==0.3.0
click-plugins==1.1.1
click-repl==0.2.0
colorama==0.4.6
convertdate==2.4.0
cron-descriptor==1.2.32
croniter==1.3.8
cryptography==38.0.3
deprecation==2.1.0
dnspython==2.2.1
email-validator==1.3.0
exceptiongroup==1.0.4
Flask==2.0.3
Flask-AppBuilder==4.1.6
Flask-Babel==2.0.0
Flask-Caching==1.11.1
Flask-Compress==1.13
Flask-JWT-Extended==4.4.4
Flask-Login==0.6.2
Flask-Migrate==4.0.0
Flask-SQLAlchemy==2.5.1
flask-talisman==1.0.0
Flask-WTF==1.0.1
frozenlist==1.3.3
func-timeout==4.3.5
geographiclib==2.0
geopy==2.3.0
graphlib-backport==1.0.3
gunicorn==20.1.0
h11==0.14.0
hashids==1.3.1
holidays==0.10.3
humanize==4.4.0
idna==3.4
importlib-metadata==5.0.0
isodate==0.6.1
itsdangerous==2.1.2
Jinja2==3.1.2
jsonschema==4.17.1
kombu==5.2.4
korean-lunar-calendar==0.3.1
Mako==1.2.4
Markdown==3.4.1
MarkupSafe==2.1.1
marshmallow==3.19.0
marshmallow-enum==1.5.1
marshmallow-sqlalchemy==0.26.1
msgpack==1.0.4
multidict==6.0.2
numpy==1.22.1
outcome==1.2.0
packaging==21.3
pandas==1.3.5
parsedatetime==2.6
pgsanity==0.2.9
Pillow==9.3.0
polyline==1.4.0
prison==0.2.1
prompt-toolkit==3.0.33
pyarrow==5.0.0
pycparser==2.21
PyJWT==2.6.0
PyMeeus==0.5.11
pyparsing==3.0.9
pyrsistent==0.19.2
PySocks==1.7.1
python-dateutil==2.8.2
python-dotenv==0.21.0
python-geohash==0.8.5
pytz==2022.6
PyYAML==6.0
redis==4.3.5
selenium==4.6.0
simplejson==3.18.0
six==1.16.0
slackclient==2.5.0
sniffio==1.3.0
sortedcontainers==2.4.0
SQLAlchemy==1.3.24
SQLAlchemy-Utils==0.37.9
sqlparse==0.3.0
tabulate==0.8.9
trio==0.22.0
trio-websocket==0.9.2
typing-extensions==3.10.0.2
urllib3==1.26.13
vine==5.0.0
wcwidth==0.2.5
webencodings==0.5.1
Werkzeug==2.0.3
wsproto==1.2.0
WTForms==2.3.3
WTForms-Ext==0.5
WTForms-JSON==0.3.5
yarl==1.8.1
zipp==3.10.0
```
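With a list this long it is handy to look up which version a given package is pinned to, for example to confirm which Superset release the file installs. A small sketch, assuming every line has the `name==version` form shown above; `pinned_version` is a hypothetical helper, not a pip command:

```shell
# Print the version pinned for a package in a requirements file.
# Package names are compared case-insensitively.
pinned_version() {
    req_file="$1"
    package="$2"
    awk -F'==' -v pkg="$package" \
        'tolower($1) == tolower(pkg) { print $2 }' "$req_file"
}

# Usage:
# pinned_version requirements.txt apache-superset
```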
Run the program
Create an admin account (you will be prompted for a username, first name, last name, email, and password):
```shell
export FLASK_APP=superset
flask fab create-admin
```
Initialize the database:
```shell
superset db upgrade
```
Load some sample data:
```shell
superset load_examples
```
Create the default roles and permissions:
```shell
superset init
```
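If you expect to repeat this on another machine, the setup commands from this section can be collected into one function that stops at the first failure. This is just a convenience sketch that keeps the same order as above; the name `bootstrap_superset` is mine, and the admin step still prompts interactively:

```shell
# Run the Superset setup commands in order, stopping at the first failure.
bootstrap_superset() {
    export FLASK_APP=superset
    flask fab create-admin &&      # prompts for the admin account details
        superset db upgrade &&     # initialize the database
        superset load_examples &&  # load the sample data
        superset init              # create default roles and permissions
}

# Usage (inside the activated virtual environment):
# bootstrap_superset || echo "setup stopped early" >&2
```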
Start the web server on port 5000, bound to the EC2 virtual machine’s Public IPv4 DNS:
```shell
superset run -h your_Public_IPv4_DNS -p 5000
# or
superset run -h 0.0.0.0
# then open your EC2 Public IPv4 DNS in a browser; the default port is 5000
```
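To check from your own machine that the server is reachable before opening a browser, you can query Superset's `/health` endpoint. A minimal sketch, assuming `curl` is installed and the security group lets the port through; `superset_is_up` is a made-up helper name and the host is a placeholder:

```shell
# Succeeds when Superset answers on its health endpoint.
superset_is_up() {
    host="$1"
    port="${2:-5000}"   # 5000 is the port used in this guide
    curl -fsS --max-time 5 "http://$host:$port/health" >/dev/null
}

# Usage (placeholder host):
# superset_is_up your_Public_IPv4_DNS 5000 && echo "Superset is reachable"
```

If the check fails while the server is running, the inbound rule of the security group is the first thing I would look at.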
Sign in and use it!!!
References:
Beginners Guide how to Install Superset (Opensource BI platform) on EC2 AWS instance