Vajad kellegagi rääkida?
Küsi julgelt abi LasteAbi
Logi sisse

Andmekaevandamine Amazoni Raamatud SQL-iga (0)

5 VÄGA HEA
Punktid
TALLINN  UNIVERSITY  OF  TECHNOLOGY
Institute of Informatics
Department of Informatics
Chair of Foundations of Informatics
Similarity
Nth Assignment for Data Mining   Course
IDN0100
Student :
Matriculation ID:
E-mail:
Supervisor:  Innar  Liiv
TALLINN
2013
Preparation :
This  task  report describes completion of the assignment of  finding  customer  behavior  similarities via
creation  of database entries and execution of SQL queries.  First  off a PostgreSQL 9.2.2 has been
installed and a table has been created to accommodate the data from data.txt file.
-- DROP TABLE person_book;
CREATE TABLE person_book
id  serial  NOT NULL,
person_id  integer  NOT NULL,
book_id integer NOT NULL,
CONSTRAINT person_book_pkey  PRIMARY  KEY (id)
CREATE INDEX idx_book_id
ON person_book
USING btree
(book_id);
Then, the data was imported into the table. The  following  command  works  when database actually
runs on the  machine  the data file is  located  on (or  something  like this…  Sure  as  hell  the  processing
engine  couldn’t get to the file when database with the table was located on a ttu  server  and the
data.txt file on my PC).
COPY person_book(person_id, book_id) FROM 'C:\data.txt' DELIMITERS ','
CSV HEADER;
Task 1:
In  order  to determine the number of people who have  purchased   both  book X and book Y a  function
has been written in PLPGSQL. The  SELECT  sentence in the  given  function operates the following way
1) it selects all IDs of people who have purchased book Y
2) it performs inner  join  of IDs selected in the  first  step with  selection  of IDs of people who have
purchased book X
3) it counts the number of  distinct  IDs in the resulting set, thusly solving the problem
CREATE OR  REPLACE  FUNCTION joint_buyers_text(integer, integer)
RETURNS  text AS
$ BODY $
DECLARE
bookX  ALIAS  FOR $1;
bookY ALIAS FOR $2;
count  integer;
BEGIN
SELECT INTO count COUNT(DISTINCT person_id)
FROM
SELECT p.person_id  FROM person_book AS p
INNER JOIN (SELECT person_id FROM person_book where book_id=bookY)
AS b
ON p.person_id=b.person_id WHERE p.book_id=bookX
) al;
RETURN  " Books  "||bookX||"and"||bookY||" have been purchased together
by"||count||" people";
END;
$BODY$
LANGUAGE plpgsql VOLATILE  SECURITY  DEFINER
Example function  call  to test the function:
SELECT joint_buyers_text(0, 9) as  answer
Result  as presented in pgAdmin III:
Task 2:
In order to simplify the SQL query  required  for completion of the task a view has been created to  store
the list of IDs of all distinct books mentioned in data.txt
CREATE OR REPLACE VIEW unique_books AS
SELECT distinct book_id from person_book
Also, a  slightly  altered  version  of the function presented  above  has been created as it is does the
counting  required for the final  output of the task. In the function presented  below  the return type has
been altered to integer and text portion of the return  statement  eliminated.
CREATE OR REPLACE FUNCTION joint_buyers(integer,integer)
RETURNS integer
AS
$BODY$
DECLARE
bookX ALIAS FOR $1;
bookY ALIAS FOR $2;
count integer;
BEGIN
SELECT INTO count COUNT(DISTINCT person_id)
FROM
SELECT p.person_id  FROM person_book AS p
INNER JOIN (SELECT person_id FROM person_book where book_id=bookY)
AS b
ON p.person_id=b.person_id WHERE p.book_id=bookX
) al;
RETURN count;
END;
$BODY$
LANGUAGE plpgsql VOLATILE SECURITY DEFINER
For the sake of pro- activity  the  results  of this task’s resolution are saved in a table created specifically
for the  purpose  of simplifying and speeding-up the  attainment  of results for the third task of this
assignment.
CREATE TABLE joint_purchase_stat
stats_id serial NOT NULL,
bookx integer NOT NULL,
booky integer NOT NULL,
joint_buys integer NOT NULL,
CONSTRAINT joint_purchase_stat_pkey PRIMARY KEY (stats_id)
Then the above table  gets  populated with the result of a  complex  and a  demanding  select query (it has
taken over 10(!)  hours  to run on my PC as the  amount  of data to  process  was huge and it resulted in
creation of over 3,500,000 rows).
1)  Insert   clause  is commented out  since  it’s optional.
2) Selection of all  unique combinations of books (t1.book_id>t2.book_id clause  takes  care of
uniqueness of the combinations) that
3) Have both been purchased by at  least  one  person  by
3.1) selection of people who have purchased book X and
3.2)  testing  their IDs against the records of people who have bought book Y – if  there ’s at least one
matching ID  between  those selections, it  means  that  there indeed is a  person  who has purchased both
books and the combination can be regarded further and
4) the  overall  number of  joint  purchases  established .
--INSERT INTO joint_purchase_stat (bookX, bookY,joint_buys)
SELECT t1.book_id, t2.book_id, joint_buyers(t1.book_id, t2.book_id) AS
bought_together FROM unique_books AS t1  CROSS  JOIN unique_books AS t2
WHERE t1.book_id!=t2.book_id AND t1.book_id>t2.book_id and EXISTS
(SELECT t4.person_id, t4.book_id FROM person_book AS t4 INNER JOIN
(SELECT t3.person_id, t3.book_id FROM person_book WHERE
t3.book_id=t2.book_id)t3
on t3.person_id=t4.person_id WHERE t4.book_id=t1.book_id)
The data was extracted from the table into a .txt file, but  unfortunately , due to its  size  (44 MB) it’s
unreasonable to try to  upload  it for inspection. The command used to  export  the data:
COPY joint_purchase_stat(bookx, booky, joint_buys) to
'C:\ windows \temp\joint.txt'  DELIMITER ',' CSV HEADER;
Task 3:
The  solution  of the third task was  quite  straightforward compared to the  previous  two. The table
joint_purchase_stat we created in the course of completion of task 2 was used as a  basis  for
calculations of the following query, in which
1) a number of people who have purchased book X is counted and  cast  to decimal ( cause  we don’t
want to see 0% as result – it tells us  nothing , so we make sure that the data  types  participating in
division have that decimal  part ) and
2) two books are listed, alongside the nicely readable ( thanks  to  round  function, which allows to
display the designated number of decimal digits) percentage of people who have also purchased book
Y out of those who have purchased book X
3) and as we want to see the reverse  number for the combination of book X and book Y too, we add
another , almost  identical SELECT query
3.1) tweaking the name of the book overall buys are  found  for
3.2) and merge results by UNION clause.
SELECT bookx, booky,
ROUND(CAST(100*joint_buys AS
DECIMAL(10,2))/(SELECT CAST(COUNT (distinct person_id)AS decimal
(10,2)) FROM person_book WHERE book_id=bookx),2)||'%' AS percentage
FROM joint_purchase_stat
UNION
SELECT jps2.booky, jps2.bookx,
ROUND(CAST(100*joint_buys AS
decimal(10,2))/(SELECT CAST(COUNT (distinct person_id)AS decimal
(10,2)) FROM person_book WHERE book_id=jps2.booky),2)||'%' AS
percentage
FROM joint_purchase_stat AS jps2
In order to export the data to a text file a table has been created and populated by the data returned by
the above query. The file has over 7,000,000 lines in it and takes over 110 MB of  disk   space .
Andmekaevandamine Amazoni Raamatud SQL-iga #1 Andmekaevandamine Amazoni Raamatud SQL-iga #2 Andmekaevandamine Amazoni Raamatud SQL-iga #3 Andmekaevandamine Amazoni Raamatud SQL-iga #4 Andmekaevandamine Amazoni Raamatud SQL-iga #5
Punktid 50 punkti Autor soovib selle materjali allalaadimise eest saada 50 punkti.
Leheküljed ~ 5 lehte Lehekülgede arv dokumendis
Aeg2013-04-18 Kuupäev, millal dokument üles laeti
Allalaadimisi 21 laadimist Kokku alla laetud
Kommentaarid 0 arvamust Teiste kasutajate poolt lisatud kommentaarid
Autor Alysanne Õppematerjali autor

Sarnased õppematerjalid

HWII ProgrammingII
15
docx

HWII ProgrammingII

TALLINN UNIVERSITY OF TECHNOLOGY School of Information Technologies Department of Computer Systems BOOK SHOP B Homework II Supervisor: Lembit Jürimägi Tallinn 2021 Tallinn 2021 Author’s declaration of originality I hereby certify that I am the sole author of this thesis. All the used materials, references to the literature and the work of others have been referred to. This thesis has not been presented for examination anywhere else. Author: 01.06.2022 3 Table of contents 4 Table of figures 1.Task description My student code is xxxxx1 and my task is Book Shop B. For my variant I need to create two text files: “books.txt”, which stores all the information about the books, and “inventory.txt”, which stores all the records about the quantity and location of books in the shop. The

Kategoriseerimata
Veebiteenused-kordamisküsimused ja vastused kontrolltööks
40
doc

Veebiteenused (kordamisküsimused ja vastused kontrolltööks)

VEEBIT EENUSED. KONT ROLLTÖÖ. SOA o A service-oriented architecture (SOA) is an architectural pattern in computer software design in which application components provide services to other components via a communications protocol, typically over a network. The principles of service-orientation are independent of any vendor, product or technology. o Kasutab XMLi sõnumivahetuseks o Võimalus integreeride süsteeme Service-oriented architecture (SOA)  Arhitektuur, mis kasutab – teenuseid organisatsiooni integrastiooni ehitusklotsidena – komponentide taaskasutust läbi nõrga seotuse. SOA: On arhitektuur  Mingi hulga teenuste tegemine ei anna meile SOA-d.  Arhitektuur peab andma meile juhised teenuste loomiseks. SOA: Ehitatakse teenustest  Nagu objekt-orienteeritud maailmas on objekt/klass nii on SOA-s teenuses peamine komponent. 

Programmeerimine
Thesis Kivimaa August 2022
140
pdf

Thesis Kivimaa August 2022

Thesis “How is it possible to calculate IT security effectiveness?” Kristjan Kivimaa August 2022 1 Abstract In IT Security world, there is lack of available, reliable systems for measuring security levels/posture. They lack the range of quantitative measurements and easy and fast deployment, and potentially affects companies of all sizes. Readily available security standards provide qualitative security levels, but not quantitative results – that would be easily comparable. This deficiency makes it hard for companies to evaluate their security posture accurately. Absence of security metrics makes it complicated for customers to select the appropriate measures for particular security level needed. The research question for this research project is – “How is it possible to calculate IT security effectiveness?”. The aim of this research is to use this reference m

Infotehnoloogia
CPM1A Programmable Controllers Operation Manual 1784470
402
pdf

CPM1A Programmable Controllers Operation Manual 1784470

Cat. No. W317-E1-11 SYSMAC CPM1A Programmable Controllers OPERATION MANUAL CPM1A Programmable Controllers Operation Manual Revised October 2007 iv Notice: OMRON products are manufactured for use according to proper procedures by a qualified operator and only for the purposes described in this manual. The following conventions are used to indicate and classify precautions in this manual. Always heed the information provided with them. Failure to heed precautions can result in injury to people or dam- age to property. ! DANGER Indicates an imminently hazardous situation which, if not avoided, will result in death or serious injury. Additionally, there may be severe property damage. ! WARNING Indicates a potentially hazardous situation which, if not avoided, could result in death or serious inju

Automatiseerimistehnika
Vormistamine ülesanne 3
18
docx

Vormistamine ülesanne 3

Edith D. de Leeuw, Joop J. Hox, Don A. Dillman INTERNATIONAL HANDBOOK OF SURVEY METHODOLOGY ÜLESANNE Õppeaines: SISSEJUHATUS ERIALASSE Tehnoloogia ja ringmajanduse instituut Õpperühm: Juhendaja: Tallinn 2021 TABLE OF CONTENTS 2 1 THE CORNERSTONES OF SURVEY RESEARCH 1.1 Introduction The idea of conducting a survey is deceptively simple. It involves identifying a specific group or category of people and collecting information from some of them in order to gain insight into what the entire group does or thinks; however, undertaking a survey inevitably raises questions that may be difficult to answer. How many people need to be surveyed in order to be able to describe fairly accurately the entire group? How should the people be selected? What questions should be asked and how should they be posed to respondents? In addition, what data collectio

Andme-ja tekstitöötlus
Kuidas muudab mudelprojekteerimine teraskonstruktsioonide valmistamist ja ehitamist
228
pdf

Kuidas muudab mudelprojekteerimine teraskonstruktsioonide valmistamist ja ehitamist

EHITUSTEADUSKOND Ehitustootluse instituut KUIDAS MUUDAB MUDELPROJEKTEERIMINE TERASKONSTRUKTSIOONIDE PROJEKTEERIMIST, VALMISTAMIST JA EHITAMIST? HOW ARE 3D AND BIM CHANGING THE DESIGN, FABRICATION AND CONSTRUCTION OF COMPLEX STEEL STRUCTURES? EPJ 60 LT Üliõpilane: Tanel Friedenthal Juhendaja: Prof. Roode Liias Kaasjuhendaja: Prof. Carrie S. Dossick Tallinn, 2010.a. Olen koostanud lõputöö iseseisvalt. Kõik töö koostamisel kasutatud teiste autorite tööd, olulised seisukohad, kirjandusallikatest ja mujalt pärinevad andmed on viidatud. …………………………………………….. (töö autori allkiri ja kuupäev) Üliõpilase kood: 041399 Töö vastab magistritööle esitatud nõuetele …………………………………………?

Ehituskonstruktsioonid
Raamatu ajalugu - kokkuvõte
15
doc

Raamatu ajalugu - kokkuvõte

BOOKS (From Wikipedia, the free encyclopedia) A book is a set or collection of written, printed, illustrated, or blank sheets, made of paper, parchment, or other various material, usually fastened together to hinge at one side. A single sheet within a book is called a leaf, and each side of a leaf is called a page. A book produced in electronic format is known as an electronic book (e-book). Books may also refer to works of literature, or a main division of such a work. In library and information science, a book is called a monograph, to distinguish it from serial periodicals such as magazines, journals or newspaper. The body of all written works including books is literature. In novels and sometimes other types of books (e.g. biographies), a book may be divided into several large sections, also called books (Book 1, Book 2, Book 3, etc.). A lover of books is usually referred to as a bibliophile, or, more informally, a bookworm. A store where books are bought and sold is a bookstore

Inglise keel_baaskursus
Book Analog Interfacing to Embedded Microprocessors
568
pdf

Book Analog Interfacing to Embedded Microprocessors

Analog Interfacing to Embedded Microprocessors Real World Design Analog Interfacing to Embedded Microprocessors Real World Design Stuart Ball Boston Oxford Auckland Johannesburg Melbourne New Delhi Newnes is an imprint of Butterworth–Heinemann. Copyright © 2001 by Butterworth–Heinemann A member of the Reed Elsevier group All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, electronic, mechanical, photocopying, recording, or otherwise, without the prior written permission of the publisher. Recognizing the importance of preserving what has been written, Butterworth–Heinemann prints its books on acid-free paper whenever possible. Library of Congress Cataloging-in-Publication Data Ball, Stuart R., 1956– Analog interfacing to embedded microprocessors : real world design / Stuart Ball. p. cm. ISBN 0-7506-7339-7 (pbk. : alk. paper) 1. Embedded computer

Mehhatroonika




Meedia

Kommentaarid (0)

Kommentaarid sellele materjalile puuduvad. Ole esimene ja kommenteeri



Sellel veebilehel kasutatakse küpsiseid. Kasutamist jätkates nõustute küpsiste ja veebilehe üldtingimustega Nõustun