Deividson on Databases: Views
PostgreSQL - "The world's most advanced open source database" as stated at http://www.postgresql.org/ - is packed with features that can help you immensely when developing applications. In this series, we will see three very important, often under-utilized options that have a broad range of uses. They are called Views, Stored Procedures, and Triggers.
We will use real-world examples, with lots of code to help you understand those features.
Views
A View is a pre-selection of data that can be accessed by an SQL query. It minimizes the need for complex (sometimes very complex) SQL in your application and is often used to retrieve data for standard reports or other regularly-fetched data sets.
As an example, let's assume that you have the following tables in your database (you can use these SQL commands to create your own test suite):
create table cpu ( cpu_id serial primary key, cpu_type text ); create table video ( video_id serial primary key, video_type text ); create table computer ( computer_id serial primary key, computer_ram integer, cpu_id integer references cpu(cpu_id), video_id integer references video(video_id) );
And the following data inside it:
insert into cpu(cpu_type) values('Intel P4 Dual Core D'); insert into cpu(cpu_type) values('AMD Athlon'); insert into video(video_type) values('Geforce 8600GT'); insert into video(video_type) values('Radeon 9550'); insert into computer values (0, 512, 1, 2); insert into computer values (1, 1024, 2, 1); insert into computer values (2, 512, 2, 2);
That's a pretty simple database with 3 tables - CPUs, video types, and computers, tied up with Foreign Keys. Now let's say you need to fetch the computer data from those tables - but you don't want the CPU and Video IDs, just the description. You can use the following SQL:
select a.computer_id, a.computer_ram, b.cpu_type, c.video_type from computer a, cpu b, video c where (a.cpu_id=b.cpu_id) AND (a.video_id=c.video_id);
That will return the following:
0 512 "Intel P4 Dual Core D" "Radeon 9550" 1 1024 "AMD Athlon" "Geforce 8600GT" 2 512 "AMD Athlon" "Radeon 9550"
Now if you use that same SQL a lot, you can simplify calling it by defining a view, like this:
create or replace view computer_full(computer_id, computer_ram, cpu_type, video_type) as ( select a.computer_id, a.computer_ram, b.cpu_type, c.video_type from computer a, cpu b, video c where (a.cpu_id=b.cpu_id) AND (a.video_id=c.video_id) );
That's "create [or replace] view (view name) [(returned field aliases)]
as (sql)"
. The "or replace"
is useful for testing: it
replaces the view if it already exists, so you don't need to DROP it
before re-creating. The field aliases are optional, and if omitted the
view will return the field names.
Now anywhere you need that same data, you can have it by executing
select * from computer_full;
The result will be the same we have above. You can even use WHERE clauses too, like:
Select * from computer_full where computer_id=1
The result will be:
1 1024 "AMD Athlon" "Geforce 8600GT"
That helps, not only by simplifying your queries, but also makes it easier in case you find a bug on your code, or a better way to do it - you simply change the View, and in every place that you called it, the new code will be executed. Let's change our view to a more professional way of selecting data - junctions. Junctions are a better way of fetching data from multiple tables, being faster than the labeling tables method we used above. There's several ways of doing junctions, and on our example we can choose between 2 of them:
create or replace view computer_full as ( select computer_id, computer_ram, cpu_type, video_type from computer a right join cpu b on (a.cpu_id=b.cpu_id) right join video c on (a.video_id=c.video_id) );
and
create or replace view computer_full as ( select computer_id, computer_ram, cpu_type, video_type from computer natural right join cpu natural right join video );
The first option, using "on" is used when the Column name you are using to connect the two tables are different ('on computer_cpu_id=cpu_id'), and the second one (natural join) is used when both column names are identical. That's the reason we had to label our tables (computer a, cpu b, video c) in our first example - using 'on cpu_id=cpu_id' would be ambiguous. Try to always use Natural Joins whenever possible - they are easier to understand and usually faster.
Temporary ViewsTemporary Views are, just as the name says, temporary views. You can use then just like views, but they are not supposed to be static (i.e., always in your database.) A quick example of a temp view is:
create temp view products (model, price) as select model, price from pc union select model, price from laptop union select model, price from printer
and dispose of it with
drop view test
Temporary views can be useful for storing the code of dynamic reports or similar tasks.
Conclusion
Views are more of a convenience than a speed feature, but using them can be helpful when developing systems.
There's a small .sql file attached to this article - if you open it in PGAdmin, there's code to create test data for our sample tables (50000 CPUs and Videos, 500000 computers). You can use it to see how Views and normal SQL commands behave, speed- wise, with a large dataset.
Well, that's it for this article, I hope it was useful - next month, we'll cover Stored Procedures!
A special thanks goes out to teacher Saulo Benvenutti, great Database teacher - it was a pleasure to be in your classes.
Talkback: Discuss this article with The Answer Gang
Deividson was born in União da Vitória, PR, Brazil, on 14/04/1984. He became interested in computing when he was still a kid, and started to code when he was 12 years old. He is a graduate in Information Systems and is finishing his specialization in Networks and Web Development. He codes in several languages, including C/C++/C#, PHP, Visual Basic, Object Pascal and others.
Deividson works in Porto União's Town Hall as a Computer Technician, and specializes in Web and Desktop system development, and Database/Network Maintenance.