
window functions allow one to look at the previous values or next values of a column. for example if i want to subtract the previous row value from the current row value then window functions lag and lead can be used.

let us take up an example.

first create a text file containing numbers 1 through 9 with a single number on each line like so


1

2

3

.

.

9

call the file data.txt.

next we create a table in hive.


create table foo (a int);

next we load our data.txt file into this table using the following


load data local inpath 'data.txt' overwrite into table foo;

we want to access the previous and next values of column 'a', hence the over clause in the following query


select lag(a, 1) over (order by a) as previous, a, lead(a, 1) over (order by a) as next from foo;

which outputs the following:


previous a next
NULL 1 2
1 2 3
2 3 4
3 4 5
4 5 6
5 6 7
6 7 8
7 8 9
8 9 NULL

note how the previous and next values are NULL at the edges, where there is no previous or next row. you can supply a default for such cases as a third argument; the following query uses 0.


select lag(a, 1, 0) over (order by a) as previous, a, lead(a, 1, 0) over (order by a) as next from foo;

which outputs the following:


previous a next
0 1 2
1 2 3
2 3 4
3 4 5
4 5 6
5 6 7
6 7 8
7 8 9
8 9 0

lag(a, 1) fetches the value from the previous row while lag(a, 2) fetches the value from two rows back.
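to make the offset and default arguments concrete, here is a rough python sketch that mimics the lag and lead calls above on a plain list. the function names and signatures are my own and only imitate the behaviour described here (non-negative offsets, rows already sorted); this is not how hive implements them.

```python
def lag(values, offset=1, default=None):
    """For each position, return the value `offset` rows earlier, else `default`."""
    return [values[i - offset] if i - offset >= 0 else default
            for i in range(len(values))]


def lead(values, offset=1, default=None):
    """For each position, return the value `offset` rows later, else `default`."""
    n = len(values)
    return [values[i + offset] if i + offset < n else default
            for i in range(n)]


# the rows of foo, sorted to play the role of "order by a"
a = sorted(range(1, 10))

# same shape as the second query: previous, a, next with 0 as the default
for previous, current, following in zip(lag(a, 1, 0), a, lead(a, 1, 0)):
    print(previous, current, following)
```

calling lag(a, 2) on the same list gives None for the first two rows, since neither has a row two steps back.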

case matters.

i had created a table like so

create table foo(

a int,

b string

) stored as orc tblproperties("orc.compress"="snappy");

but when i went to populate the table using an “insert overwrite table foo select * from …” statement then i faced an error.

turns out i should have used "SNAPPY" instead of "snappy", i.e. tblproperties("orc.compress"="SNAPPY"). case matters.

https://cwiki.apache.org/confluence/display/Hive/LanguageManual+ORC

if you do not want to apply markdown formatting rules to a chunk of text then wrap it in the following tags.

opening tag: ```text i.e. 3 backticks followed by the word text

closing tag: ``` i.e. 3 backticks

this gives you unformatted text, displayed exactly as it was typed in.

import datetime

print(datetime.datetime.strftime(datetime.datetime.now(), '%c'))

hope this becomes the number 1 search result when people type this post’s title in to google.

i like to sort my vifm buffers by modification time but don’t like the fact that all the directories are grouped first and then the files next. i wanted to sort the buffer so that there is no distinction between directories and files when sorting by modification time. here is how you can achieve it.

type a colon (:) to open the vifm command prompt

then issue the following command

set sort=mtime,dir

mtime is for modification time and dir for directories.

i have mapped that to the leader key+s combo in vifmrc file thusly

nmap ,s :set sort=mtime,dir<cr>
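the order of the keys is what matters here. as far as i understand the setting, keys are applied left to right, so with mtime first the dir key only breaks ties. the python sketch below imitates that idea on a throwaway directory; the function names are made up for illustration and this is not vifm's code.

```python
import os
import tempfile


def mtime_then_dir(path):
    """Sort by modification time; directories win only when times are equal."""
    return [e.name for e in sorted(
        os.scandir(path), key=lambda e: (e.stat().st_mtime, not e.is_dir()))]


def dir_then_mtime(path):
    """Group directories first, then sort each group by modification time."""
    return [e.name for e in sorted(
        os.scandir(path), key=lambda e: (not e.is_dir(), e.stat().st_mtime))]


# build a scratch directory with known modification times
root = tempfile.mkdtemp()
for name, mtime, is_dir in [("old.txt", 100, False),
                            ("mid_dir", 200, True),
                            ("new.txt", 300, False)]:
    full = os.path.join(root, name)
    if is_dir:
        os.mkdir(full)
    else:
        open(full, "w").close()
    os.utime(full, (mtime, mtime))  # force a known modification time

print(mtime_then_dir(root))  # directory sits between the two files
print(dir_then_mtime(root))  # directory is listed first
```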

matplotlib default figsize

The default value is [8.0, 6.0] in matplotlib 1.x (matplotlib 2.0 and later default to [6.4, 4.8]), which can be changed of course. To know all the default values just inspect the value of 'rcParams'.


import matplotlib.pyplot as plt

print(plt.rcParams) # to examine all values

print(plt.rcParams.get('figure.figsize'))
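Since the default can be changed, here is a small sketch of two ways to do it: temporarily with rc_context, or for the rest of the session by assigning to rcParams. The sizes 10x4 and 12x5 are arbitrary values picked for illustration, and the Agg backend is selected only so the snippet runs without a display.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend, no display needed
import matplotlib.pyplot as plt

default_size = list(plt.rcParams["figure.figsize"])

# change the default only inside this block
with plt.rc_context({"figure.figsize": [10.0, 4.0]}):
    fig = plt.figure()
    size_in_block = [float(v) for v in fig.get_size_inches()]
    plt.close(fig)

# the previous default is restored once the block ends
restored_size = list(plt.rcParams["figure.figsize"])

# change the default for the rest of the session
plt.rcParams["figure.figsize"] = [12.0, 5.0]
fig = plt.figure()
size_after = [float(v) for v in fig.get_size_inches()]
plt.close(fig)

print(default_size, size_in_block, restored_size, size_after)
```

To make the change permanent, the same key can go in a matplotlibrc file instead.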

on trying to plot anything in ipython using matplotlib i got the following error

This application failed to start because it could not find or load the Qt platform plugin "xcb".

Available platform plugins are: eglfs, kms, linuxfb, minimal, minimalegl, offscreen, xcb.

Reinstalling the application may fix this problem.
Aborted (core dumped)

for example the following command will produce the error

ipython -c 'import pylab; pylab.plot()'

matplotlib backend: Qt5Agg

uname -srvmo :: Linux 3.16.1-1-ARCH #1 SMP PREEMPT Thu Aug 14 07:40:19 CEST 2014 x86_64 GNU/Linux

$ pacman -Q ipython python-matplotlib
ipython 2.2.0-1
python-matplotlib 1.4.0-2

solution :: install libxkbcommon-x11

you may also get the error

You can’t change the terminal in multiplot mode

the following code WILL produce the error:

set multiplot layout 1,2
set term postscript eps enhanced
set out 'a.eps'
plot sin(x)
plot cos(x)

this is an annoyance, but the fix is simple: set the terminal and the output file before you issue the set multiplot command, and the error message goes away.

set term postscript eps enhanced
set out 'a.eps'
set multiplot layout 1,2 # set 'multiplot' after 'term' and 'out'
plot sin(x)
plot cos(x)
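if you have many gnuplot scripts lying around, a quick check for this ordering problem can be sketched in python. the helper below is hypothetical and deliberately naive: it only recognises the spellings starting with term and out, and it treats everything after a # as a comment, even inside quotes.

```python
def misplaced_terminal_lines(script):
    """Return 1-based numbers of 'set term'/'set out' lines found while
    multiplot mode is on."""
    in_multiplot = False
    problems = []
    for number, line in enumerate(script.splitlines(), start=1):
        words = line.split("#", 1)[0].split()  # drop comments, then tokenize
        if len(words) < 2:
            continue
        if words[0] == "set" and words[1] == "multiplot":
            in_multiplot = True
        elif words[0] == "unset" and words[1] == "multiplot":
            in_multiplot = False
        elif (in_multiplot and words[0] == "set"
              and words[1].startswith(("term", "out"))):
            problems.append(number)
    return problems


bad = """set multiplot layout 1,2
set term postscript eps enhanced
set out 'a.eps'
plot sin(x)
plot cos(x)"""

good = """set term postscript eps enhanced
set out 'a.eps'
set multiplot layout 1,2 # set 'multiplot' after 'term' and 'out'
plot sin(x)
plot cos(x)"""

print(misplaced_terminal_lines(bad))
print(misplaced_terminal_lines(good))
```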

vifm crashes on launch

on archlinux with openbox window manager and no desktop environment

Linux  3.11.1-1-ARCH #1 SMP PREEMPT Sat Sep 14 20:31:35 CEST 2013 i686 GNU/Linux

vifm 0.7.5-1

ncurses 5.9-5

gtk2 2.24.20-1

xterm 297-1

rxvt-unicode 9.18-7

tmux 1.8-1

vifm crashes on launch from the terminal with the following message:

vifm: color_manager.c:47: colmgr_init: Assertion `(color_pair_map != ((void *)0) || avail_pairs == 0) && "Not enough memory."' failed. Aborted (core dumped)

turns out that the crash is caused by the TERM environment variable being set to xterm-256color or screen-256color in xterm or the xterm/tmux combo. in rxvt-unicode the TERM was set to rxvt-unicode-256color. setting TERM to xterm or screen does not cause any crash.

i have been bugged by this for some time now but only found out the reason today.

do not read the startup file ~/.gnuplot

gnuplot reads the ~/.gnuplot file on startup. one can define useful macros or variables in this file so that they are readily available to the user. today i wanted to start gnuplot and bypass the definitions contained in the startup file. one obvious way is to just rename the startup file but that seemed like a bit of a hassle, so this is what i came up with after reading the help section for startup

$ HOME='' gnuplot

that is, just set the HOME environment variable to an empty string on the same command line that invokes gnuplot. this works because gnuplot looks for the file .gnuplot in the directory named by HOME, which is empty for that one invocation. please note that HOME is changed only for the gnuplot process being launched and only for as long as that session lasts; the shell's own HOME is untouched. to find out more about startup use the following within a gnuplot session

gnuplot> help startup
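the same trick works from a script. here is a minimal python sketch, assuming you launch the program through subprocess: the child gets a copy of the environment with HOME emptied while the parent keeps its own. the helper name is made up, and a small python one-liner stands in for gnuplot so the snippet runs even where gnuplot is not installed.

```python
import os
import subprocess
import sys

home_before = os.environ.get("HOME")


def run_with_empty_home(command):
    """Run `command` with HOME set to an empty string, for the child only."""
    child_env = dict(os.environ, HOME="")  # copy, so os.environ is untouched
    return subprocess.run(command, env=child_env,
                          capture_output=True, text=True)


# stand-in child process that just reports the HOME it sees
probe = [sys.executable, "-c", "import os; print(repr(os.environ['HOME']))"]
result = run_with_empty_home(probe)

print("child sees HOME =", result.stdout.strip())
print("parent still has HOME =", os.environ.get("HOME"))
```

for an interactive gnuplot session you would pass ["gnuplot"] as the command and drop the output capturing.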

$ gnuplot --version
gnuplot 4.6 patchlevel 1

$ bash --version
GNU bash, version 4.2.42(2)-release (i686-pc-linux-gnu)