<?xml version="1.0" encoding="utf-8"?>
<search>
<entry>
<title><![CDATA[Learn Python Basics in Two Days]]></title>
<url>/2018/02/26/python-basics/</url>
<content type="html"><![CDATA[<h1 id="前言"><a href="#前言" class="headerlink" title="前言"></a>Preface</h1><p>The <strong>Learn Python Basics in Two Days</strong> series is the Python part of the “learn by example” programming course. The original English GitHub repository is <a href="https://github.com/learnbyexample/Python_Basics" target="_blank" rel="noopener">here</a>, the Chinese GitHub repository is <a href="https://github.com/ShixiangWang/Python_Basics" target="_blank" rel="noopener">here</a>, and all chapters have also been posted on Jianshu (see the Chapters section).</p>
<div class="github-widget" data-repo="ShixiangWang/Python_Basics"></div>
<a id="more"></a>
<p>To study offline, clone the repository:</p>
<figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">git clone https://github.com/ShixiangWang/Python_Basics</span><br></pre></td></tr></table></figure>
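<p>This series is intended for study and reference only. My abilities are limited and I am still learning much of the terminology, so please point out any mistakes. Everyone is welcome to fork the repository to study, extend, or revise it.</p>
<h1 id="Python-基础"><a href="#Python-基础" class="headerlink" title="Python 基础"></a>Python Basics</h1><p>An introduction to Python: syntax, working with shell commands, files, text processing, and more.</p>
<ul>
<li>Suitable for Python beginners to work through in a day or two</li>
<li>A more complete <a href="https://github.com/ShixiangWang/scripting_course/blob/master/Python_curated_resources.md" target="_blank" rel="noopener">curated list of Python resources</a>, including beginner tutorials</li>
<li>For more related resources, visit the <a href="https://github.com/ShixiangWang/scripting_course" target="_blank" rel="noopener">scripting course</a></li>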
</ul>
<p><br></p>
<h1 id="章节"><a href="#章节" class="headerlink" title="章节"></a>Chapters</h1><ul>
<li><a href="https://www.jianshu.com/p/043b22c53464" target="_blank" rel="noopener">Introduction</a><ul>
<li>Installation, Hello World example, Python interpreter, Python standard library</li>
</ul>
</li>
<li><a href="https://www.jianshu.com/p/cea74af54c19" target="_blank" rel="noopener">Number and String data types</a><ul>
<li>Numbers, strings, constants, and built-in operators</li>
</ul>
</li>
<li><a href="https://www.jianshu.com/p/fc17f347094e" target="_blank" rel="noopener">Functions</a><ul>
<li>def, the print function, the range function, the type function, variable scope</li>
</ul>
</li>
<li><a href="https://www.jianshu.com/p/01fff554603e" target="_blank" rel="noopener">Getting User Input</a><ul>
<li>Integer input, floating-point input, string input</li>
</ul>
</li>
<li><a href="https://www.jianshu.com/p/982ce84fe274" target="_blank" rel="noopener">Executing External Commands</a><ul>
<li>Calling shell commands, calling shell commands with expansion, getting command output and redirection</li>
</ul>
</li>
<li><a href="https://www.jianshu.com/p/968e00a326e5" target="_blank" rel="noopener">Control Structures</a><ul>
<li>Condition checking, if, for, while, continue and break</li>
</ul>
</li>
<li><a href="https://www.jianshu.com/p/1e2d44b8d060" target="_blank" rel="noopener">Lists</a><ul>
<li>Assigning list variables, slicing and modifying lists, copying lists, list methods and miscellaneous, looping, list comprehensions, getting a list as user input, getting random items from a list</li>
</ul>
</li>
<li><a href="https://www.jianshu.com/p/d27050ef63f6" target="_blank" rel="noopener">Sequence, Set and Dict data types</a><ul>
<li>Strings, tuples, sets, dictionaries</li>
</ul>
</li>
<li><a href="https://www.jianshu.com/p/c1f7abc7371f" target="_blank" rel="noopener">Text Processing</a><ul>
<li>String methods, regular expressions, pattern matching and extraction, search and replace, compiling regular expressions, further reading on regular expressions</li>
</ul>
</li>
<li><a href="https://www.jianshu.com/p/294031b710c0" target="_blank" rel="noopener">File Handling</a><ul>
<li>The open function, reading files, writing to files</li>
</ul>
</li>
<li><a href="https://www.jianshu.com/p/a66b046d0c4a" target="_blank" rel="noopener">Command Line Arguments</a><ul>
<li>Known number of arguments, varying number of arguments, using the program name in code, command line switches</li>
</ul>
</li>
<li><a href="https://www.jianshu.com/p/571bab3f422e" target="_blank" rel="noopener">Exception Handling and Debugging</a><ul>
<li>Exception handling, syntax check, pdb, importing programs</li>
</ul>
</li>
<li><a href="https://www.jianshu.com/p/62a7bdbf27ba" target="_blank" rel="noopener">Docstrings</a><ul>
<li>Style guide, palindrome example</li>
</ul>
</li>
<li><a href="https://www.jianshu.com/p/c3ecc57adaf9" target="_blank" rel="noopener">Testing</a><ul>
<li>The assert statement, using assert to test a program, using the unittest framework, using unittest.mock, using unittest.mock to test user input and program output, other testing frameworks</li>
</ul>
</li>
<li><a href="https://www.jianshu.com/p/d8a4fd109b30" target="_blank" rel="noopener">Exercises</a></li>
<li><a href="https://www.jianshu.com/p/188366d88aa5" target="_blank" rel="noopener">Further Reading</a><ul>
<li>Standard topics not covered, useful programming links, Python extension packages</li>
</ul>
</li>
</ul>
<p><br></p>
<h1 id="电子书"><a href="#电子书" class="headerlink" title="电子书"></a>E-book</h1><ul>
<li>Read as an e-book on <a href="https://learnbyexample.gitbooks.io/python-basics/content/index.html" target="_blank" rel="noopener">gitbook</a></li>
<li>Download for offline reading - <a href="https://www.gitbook.com/book/learnbyexample/python-basics/details" target="_blank" rel="noopener">link</a></li>
</ul>
<p><br></p>
<h1 id="致谢"><a href="#致谢" class="headerlink" title="致谢"></a>Acknowledgements</h1><ul>
<li><a href="https://automatetheboringstuff.com/" target="_blank" rel="noopener">automatetheboringstuff</a> - got me started with Python</li>
<li><a href="https://www.reddit.com/r/learnpython/" target="_blank" rel="noopener">/r/learnpython/</a> - a helpful forum for beginners and experienced programmers alike</li>
<li><a href="http://devup.in/" target="_blank" rel="noopener">Devs and Hackers</a> - helpful slack group</li>
<li><a href="https://www.reddit.com/r/india/search?q=Weekly+Coders%2C+Hackers+%26+All+Tech+related+thread+author%3Aavinassh&restrict_sr=on&sort=new&t=all" target="_blank" rel="noopener">Weekly Coders, Hackers & All Tech related thread</a> - thanks for the suggestions and comments</li>
</ul>
<p><br></p>
<h1 id="许可证"><a href="#许可证" class="headerlink" title="许可证"></a>License</h1><p>This work is licensed under a <a href="https://creativecommons.org/licenses/by-nc-sa/4.0/" target="_blank" rel="noopener">Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License</a></p>
]]></content>
<categories>
<category> Python之歌 </category>
</categories>
<tags>
<tag> python </tag>
</tags>
</entry>
<entry>
<title><![CDATA[Creating an independent figure path for markdown posts]]></title>
<url>/2018/02/12/how-to-create-an-independent-figpath/</url>
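<content type="html"><![CDATA[<p>In the post <a href="https://shixiangwang.github.io/2018/02/06/how-to-write-rmd-documents-in-hexo-system/">How to write articles with Rmarkdown in the hexo blog system</a>, I described how to use a function I wrote to generate an <code>Rmarkdown</code> document automatically and convert it into a <code>markdown</code> post. That post did not cover how figures are generated. While writing the previous post, <a href="https://shixiangwang.github.io/2018/02/07/how-to-do-group-survival-analysis/">How to group a continuous variable and run survival analysis</a>, I ran into a small problem with hexo: the <code>public</code> folder of static pages that <code>hexo</code> generates (everything shown on the site, including the home page) and the <code>source</code> folder that holds the posts are relatively independent of each other. As a result, an image referenced from a <code>markdown</code> file may display correctly in a local markdown previewer, yet it does not show up on the deployed blog!</p>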
<a id="more"></a>
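<p>According to the official hexo documentation at <a href="https://hexo.io/zh-cn/docs/asset-folders.html" target="_blank" rel="noopener">https://hexo.io/zh-cn/docs/asset-folders.html</a>, we can <strong>drop images into the <code>source/images</code> folder and then reference them with something like ![](/images/image.jpg)</strong>. Note that this is neither the image's absolute path on disk nor a path relative to the markdown file; it is resolved against the root of the generated site. That leads to a slightly awkward situation: the image cannot be previewed locally but is visible on the deployed blog! I wrote the previous post just as I was about to travel home and had no time to sort this out, so it was published that way.</p>
<p>The official site also mentions that some non-<code>markdown</code> tags can be used to reference images. I don't consider that a good option: what I value in <code>markdown</code> is its simplicity and how easily posts can be migrated, and such tags would make my posts <strong>less portable</strong>.</p>
<p>After reading the documentation carefully, I found a workable approach:</p>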
<blockquote>
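<p>For those who want to serve images and other assets on a regular basis, and who want to keep assets separate for each post, Hexo also provides a more organized way to manage assets. This slightly more involved but very convenient feature can be turned on by setting the <code>post_asset_folder</code> option in <code>config.yml</code> to <code>true</code>.<br> _config.yml<br> post_asset_folder: true<br>Once asset folder management is enabled, Hexo will automatically create a folder every time you create a new post with the <code>hexo new [layout] <title></code> command. This asset folder will have the same name as the markdown file. After placing all assets related to your post in this associated folder, you can reference them using relative paths, which gives you a simpler and much more convenient workflow.</p>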
</blockquote>
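<p>I created a new post this way and used <code>git</code> to track which files changed. It turns out that enabling this option simply creates, under <code>source/_posts</code>, a folder with the same name as the new <code>markdown</code> post (minus the extension). The fix is then fairly simple: I only need to set the output path for figures when <code>knit</code> converts the <code>rmd</code> document into an <code>md</code> document, that is, set the <code>opts_chunk$set(fig.path="../_posts/2018-02-12-how-to-create-an-independent-figpath/")</code> option. This can be done in two ways. The first is to edit that line directly in the <code>rmarkdown</code> template used for writing and set the path by hand each time. The second is to read the finished <code>rmarkdown</code> file into R, locate that option with a regular expression, and rewrite it to the folder name that corresponds to each post.</p>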
<figure class="highlight r"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br></pre></td><td class="code"><pre><span class="line">new_md_post <- <span class="keyword">function</span>(template_name=<span class="string">"template.Rmd"</span>,post_name=<span class="literal">NULL</span>,template_path=getwd(),</span><br><span class="line"> post_path=<span class="string">"../_posts"</span>,time_tag=<span class="literal">FALSE</span>, new_fig_path=<span class="literal">TRUE</span>){</span><br><span class="line"></span><br><span class="line"> <span class="keyword">if</span>(is.null(post_name)){</span><br><span class="line"> post_name <- gsub(pattern = <span class="string">"^(.*)\\.[Rr]md$"</span>, <span class="string">"\\1"</span>, x = template_name)</span><br><span class="line"> }</span><br><span class="line"></span><br><span class="line"> input_file <- paste(template_path,template_name, sep=<span class="string">"/"</span>)</span><br><span class="line"></span><br><span class="line"> <span 
class="keyword">if</span>(time_tag){</span><br><span class="line"> current_time <- Sys.Date()</span><br><span class="line"> out_file <- paste0(post_path, <span class="string">"/"</span>, current_time, <span class="string">"-"</span>, post_name,<span class="string">".md"</span>)</span><br><span class="line"> out_dir <- paste0(post_path, <span class="string">"/"</span>, current_time, <span class="string">"-"</span>, post_name)</span><br><span class="line"> }<span class="keyword">else</span>{</span><br><span class="line"> out_file <- paste0(post_path, <span class="string">"/"</span>, post_name,<span class="string">".md"</span>)</span><br><span class="line"> out_dir <- paste0(post_path, <span class="string">"/"</span>, post_name)</span><br><span class="line"> }</span><br><span class="line"> <span class="comment"># if new_fig_path is FALSE, use the shared figure path to store figures</span></span><br><span class="line"> <span class="comment"># if this variable is TRUE, create an independent directory for the post</span></span><br><span class="line"> <span class="keyword">if</span> (new_fig_path){</span><br><span class="line"> dir.create(out_dir, showWarnings = <span class="literal">TRUE</span>, recursive = <span class="literal">TRUE</span>)</span><br><span class="line"> <span class="comment"># point the fig.path option in the Rmd file at the new directory</span></span><br><span class="line"> fl_content <- readLines(input_file)</span><br><span class="line"> new_content <- sub(pattern = <span class="string">"fig.path=\".*\""</span>,</span><br><span class="line"> replacement = paste0(<span class="string">"fig.path=\""</span>, out_dir, <span class="string">"/\""</span>), fl_content)</span><br><span class="line"> writeLines(new_content, input_file)</span><br><span class="line"> }</span><br><span class="line"></span><br><span class="line"> knitr::knit(input = input_file, output = out_file)</span><br><span class="line"> print(<span class="string">"New markdown post created successfully!"</span>)</span><br><span 
class="line">}</span><br></pre></td></tr></table></figure>
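<p>To see what the regular-expression step inside the function does, here is a minimal sketch on a single line of text. The sample setup line and the <code>my-post</code> folder name are made up for illustration; only the <code>sub()</code> call mirrors the function above.</p>
<figure class="highlight r"><table><tr><td class="code"><pre><span class="line"># A made-up setup line such as might appear in an Rmd template (assumption)</span><br><span class="line">chunk_line <- "opts_chunk$set(fig.path=\"figures/\")"</span><br><span class="line"># Hypothetical per-post asset folder</span><br><span class="line">out_dir <- "../_posts/my-post"</span><br><span class="line"># Same pattern and replacement as in new_md_post()</span><br><span class="line">new_line <- sub(pattern = "fig.path=\".*\"",</span><br><span class="line">                replacement = paste0("fig.path=\"", out_dir, "/\""), chunk_line)</span><br><span class="line">cat(new_line, "\n")</span><br></pre></td></tr></table></figure>
<p>Because <code>.*</code> is greedy, the match runs to the last double quote on the line, so this works best when <code>fig.path</code> is the only quoted option on that line.</p>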
<p>This post serves both as a record and as a test of the feature. Let's just draw a quick plot with the most classic dataset~</p>
<figure class="highlight r"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">plot(mtcars)</span><br></pre></td></tr></table></figure>
<p><img src="test-fig-path-1.png" alt="测试图片"></p>
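<p>Interestingly, testing showed that <code>hexo</code> does things in a roundabout way: once an image is placed in the folder that belongs to the post, you only need to give the image's file name, while relative and absolute path references still do not work. At least it is more convenient than before, so be it...</p>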
]]></content>
<categories>
<category> 极客R </category>
</categories>
<tags>
<tag> R </tag>
<tag> Rmarkdown </tag>
<tag> markdown </tag>
</tags>
</entry>
<entry>
<title><![CDATA[A simple look at lapply, sapply, and vapply]]></title>
<url>/2018/02/10/easy-sapply-apply-vapply/</url>
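<content type="html"><![CDATA[<p>A post I reposted earlier, <a href="https://www.jianshu.com/p/9bca3555b06c" target="_blank" rel="noopener">Exploring the usage of apply, lapply, and sapply</a>, already explains the <code>apply</code> family of functions in R in considerable detail. This post is based on what I learned about <code>lapply</code>, <code>sapply</code>, and <code>vapply</code> on DataCamp, and uses simpler examples to explain these iteration functions that take a list as input.</p>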
<a id="more"></a>
<p>The data are a set of temperatures, measured 5 times a day for one week.</p>
<p>First we enter the data into R:</p>
<figure class="highlight r"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br></pre></td><td class="code"><pre><span class="line">temp <- list(c(<span class="number">3</span>, <span class="number">7</span>, <span class="number">9</span>, <span class="number">6</span>, -<span class="number">1</span>), c(<span class="number">6</span>, <span class="number">9</span>, <span class="number">12</span>, <span class="number">13</span>, <span class="number">5</span>), c(<span class="number">4</span>, <span class="number">8</span>, <span class="number">3</span>, -<span class="number">1</span>, -<span class="number">3</span></span><br><span class="line">), c(<span class="number">1</span>, <span class="number">4</span>, <span class="number">7</span>, <span class="number">2</span>, -<span class="number">2</span>), c(<span class="number">5</span>, <span class="number">7</span>, <span class="number">9</span>, <span class="number">4</span>, <span class="number">2</span>), c(-<span class="number">3</span>, <span class="number">5</span>, <span class="number">8</span>, <span class="number">9</span>, <span class="number">4</span>), c(<span class="number">3</span>,</span><br><span class="line"> <span class="number">6</span>, <span class="number">9</span>, <span class="number">4</span>, <span class="number">1</span>))</span><br></pre></td></tr></table></figure>
<p>The function we apply in each iteration is <code>basics</code>, which computes each day's minimum, mean, median, and maximum temperature.</p>
<figure class="highlight livecodeserver"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment"># Definition of the basics() function</span></span><br><span class="line">basics <- <span class="function"><span class="keyword">function</span>(<span class="title">x</span>) {</span></span><br><span class="line"> c(<span class="built_in">min</span> = <span class="built_in">min</span>(x), mean = mean(x), <span class="built_in">median</span> = <span class="built_in">median</span>(x), <span class="built_in">max</span> = <span class="built_in">max</span>(x))</span><br><span class="line">}</span><br></pre></td></tr></table></figure>
<p><code>lapply</code> is the most commonly used: it takes a list as input and returns a list.</p>
<figure class="highlight r"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br></pre></td><td class="code"><pre><span class="line"></span><br><span class="line">> lapply(temp, basics)</span><br><span class="line">[[<span class="number">1</span>]]</span><br><span class="line"> min mean median max</span><br><span class="line"> -<span class="number">1.0</span> <span class="number">4.8</span> <span class="number">6.0</span> <span class="number">9.0</span></span><br><span class="line"></span><br><span class="line">[[<span class="number">2</span>]]</span><br><span class="line"> min mean median max</span><br><span class="line"> <span class="number">5</span> <span class="number">9</span> <span class="number">9</span> <span class="number">13</span></span><br><span class="line"></span><br><span class="line">[[<span class="number">3</span>]]</span><br><span class="line"> min mean median max</span><br><span class="line"> -<span class="number">3.0</span> <span class="number">2.2</span> <span class="number">3.0</span> <span class="number">8.0</span></span><br><span 
class="line"></span><br><span class="line">[[<span class="number">4</span>]]</span><br><span class="line"> min mean median max</span><br><span class="line"> -<span class="number">2.0</span> <span class="number">2.4</span> <span class="number">2.0</span> <span class="number">7.0</span></span><br><span class="line"></span><br><span class="line">[[<span class="number">5</span>]]</span><br><span class="line"> min mean median max</span><br><span class="line"> <span class="number">2.0</span> <span class="number">5.4</span> <span class="number">5.0</span> <span class="number">9.0</span></span><br><span class="line"></span><br><span class="line">[[<span class="number">6</span>]]</span><br><span class="line"> min mean median max</span><br><span class="line"> -<span class="number">3.0</span> <span class="number">4.6</span> <span class="number">5.0</span> <span class="number">9.0</span></span><br><span class="line"></span><br><span class="line">[[<span class="number">7</span>]]</span><br><span class="line"> min mean median max</span><br><span class="line"> <span class="number">1.0</span> <span class="number">4.6</span> <span class="number">4.0</span> <span class="number">9.0</span></span><br></pre></td></tr></table></figure>
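<p>As you can see, with many iterations the output becomes very long, yet the results we need can be shown in a much more compact array (matrix). For that we can use <code>sapply</code>, where the <code>s</code> prefix stands for "simplify".</p>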
<figure class="highlight r"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br></pre></td><td class="code"><pre><span class="line">> sapply(temp, basics)</span><br><span class="line"> [,<span class="number">1</span>] [,<span class="number">2</span>] [,<span class="number">3</span>] [,<span class="number">4</span>] [,<span class="number">5</span>] [,<span class="number">6</span>] [,<span class="number">7</span>]</span><br><span class="line">min -<span class="number">1.0</span> <span class="number">5</span> -<span class="number">3.0</span> -<span class="number">2.0</span> <span class="number">2.0</span> -<span class="number">3.0</span> <span class="number">1.0</span></span><br><span class="line">mean <span class="number">4.8</span> <span class="number">9</span> <span class="number">2.2</span> <span class="number">2.4</span> <span class="number">5.4</span> <span class="number">4.6</span> <span class="number">4.6</span></span><br><span class="line">median <span class="number">6.0</span> <span class="number">9</span> <span class="number">3.0</span> <span class="number">2.0</span> <span class="number">5.0</span> <span class="number">5.0</span> <span class="number">4.0</span></span><br><span class="line">max <span class="number">9.0</span> <span class="number">13</span> <span class="number">8.0</span> <span class="number">7.0</span> <span class="number">9.0</span> <span class="number">9.0</span> <span class="number">9.0</span></span><br></pre></td></tr></table></figure>
<p>Compact enough?</p>
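<p>If the list is named, <code>sapply</code> carries the names over as column names, which makes the matrix easier to read. A minimal sketch: the day labels below are an assumption added for illustration (the original data are unnamed), and only the first three days are used.</p>
<figure class="highlight r"><table><tr><td class="code"><pre><span class="line">temp <- list(c(3, 7, 9, 6, -1), c(6, 9, 12, 13, 5), c(4, 8, 3, -1, -3))</span><br><span class="line">basics <- function(x) {</span><br><span class="line">  c(min = min(x), mean = mean(x), median = median(x), max = max(x))</span><br><span class="line">}</span><br><span class="line"># Hypothetical labels for the first three days</span><br><span class="line">names(temp) <- c("Mon", "Tue", "Wed")</span><br><span class="line">res <- sapply(temp, basics)</span><br><span class="line">colnames(res)        # the list names become column names</span><br><span class="line">res["mean", "Tue"]   # look up one statistic for one day by name</span><br></pre></td></tr></table></figure>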
<p>The last function I want to introduce, <code>vapply</code>, essentially adds a validation option on top of <code>sapply</code>:</p>
<figure class="highlight r"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br></pre></td><td class="code"><pre><span class="line">> vapply(temp, basics, numeric(<span class="number">4</span>))</span><br><span class="line"> [,<span class="number">1</span>] [,<span class="number">2</span>] [,<span class="number">3</span>] [,<span class="number">4</span>] [,<span class="number">5</span>] [,<span class="number">6</span>] [,<span class="number">7</span>]</span><br><span class="line">min -<span class="number">1.0</span> <span class="number">5</span> -<span class="number">3.0</span> -<span class="number">2.0</span> <span class="number">2.0</span> -<span class="number">3.0</span> <span class="number">1.0</span></span><br><span class="line">mean <span class="number">4.8</span> <span class="number">9</span> <span class="number">2.2</span> <span class="number">2.4</span> <span class="number">5.4</span> <span class="number">4.6</span> <span class="number">4.6</span></span><br><span class="line">median <span class="number">6.0</span> <span class="number">9</span> <span class="number">3.0</span> <span class="number">2.0</span> <span class="number">5.0</span> <span class="number">5.0</span> <span class="number">4.0</span></span><br><span class="line">max <span class="number">9.0</span> <span class="number">13</span> <span class="number">8.0</span> <span class="number">7.0</span> <span class="number">9.0</span> <span class="number">9.0</span> <span class="number">9.0</span></span><br><span class="line">> vapply(temp, basics, numeric(<span class="number">3</span>))</span><br><span class="line">Error <span class="keyword">in</span> vapply(temp, basics, numeric(<span class="number">3</span>)) : values must be length <span 
class="number">3</span>,</span><br><span class="line"> but FUN(X[[<span class="number">1</span>]]) result is length <span class="number">4</span></span><br></pre></td></tr></table></figure>
<p>As you can see, the third argument is the validation option; it is named <code>FUN.VALUE</code>.</p>
<figure class="highlight r"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br></pre></td><td class="code"><pre><span class="line"> > args(vapply)</span><br><span class="line"><span class="keyword">function</span> (X, FUN, FUN.VALUE, <span class="keyword">...</span>, USE.NAMES = <span class="literal">TRUE</span>)</span><br><span class="line"><span class="literal">NULL</span></span><br></pre></td></tr></table></figure>
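<p>We know each iteration should return 4 numeric values, so setting the argument to <code>numeric(3)</code> raises an error. This function and its option are very useful when writing larger pieces of code that iterate and combine results: they let us check quickly that the results are valid and spare us much of the pain of debugging.</p>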
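<p>One way to see why the check matters: <code>sapply</code> silently changes its return type depending on what the function returns, whereas <code>vapply</code> either returns the declared shape or stops. A minimal sketch, using two days of the temperature data and a made-up helper <code>above_five</code> (an assumption for illustration):</p>
<figure class="highlight r"><table><tr><td class="code"><pre><span class="line">temp <- list(c(3, 7, 9, 6, -1), c(6, 9, 12, 13, 5))</span><br><span class="line"># Hypothetical helper: keep only readings above 5, so the result length varies by day</span><br><span class="line">above_five <- function(x) x[x > 5]</span><br><span class="line"># The lengths differ (3 and 4), so sapply cannot simplify and falls back to a list</span><br><span class="line">res_s <- sapply(temp, above_five)</span><br><span class="line">class(res_s)</span><br><span class="line"># vapply refuses results that do not match FUN.VALUE; catch the error to inspect it</span><br><span class="line">res_v <- tryCatch(vapply(temp, above_five, numeric(1)),</span><br><span class="line">                  error = function(e) "error caught")</span><br><span class="line">res_v</span><br><span class="line"># On empty input vapply still returns the declared type, while sapply returns a list</span><br><span class="line">empty_v <- vapply(list(), length, integer(1))</span><br><span class="line">empty_s <- sapply(list(), length)</span><br></pre></td></tr></table></figure>
<p>In code that feeds these results into later steps, the predictable return type of <code>vapply</code> is often worth the extra argument.</p>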
]]></content>
<categories>
<category> 极客R </category>
</categories>
<tags>
<tag> R </tag>
<tag> lapply </tag>
<tag> sapply </tag>
<tag> vapply </tag>
<tag> iterative computation </tag>
</tags>
</entry>
<entry>
<title><![CDATA[How to group a continuous variable and run survival analysis]]></title>
<url>/2018/02/07/how-to-do-group-survival-analysis/</url>
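<content type="html"><![CDATA[<p>When exploring how continuous variables such as gene expression or gene copy number affect the prognosis of cancer patients, the main problem I have to face and handle is how to split such a continuous variable into groups and then run the corresponding survival analysis.</p>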
<a id="more"></a>
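<p>Those who do research analysis probably know that grouping by a variable's value usually relies on basic descriptive statistics such as the median, the quartiles, or the mean. For a finer split, you can use percentages, for example the top/bottom 5% or 10%.</p>
<p>Let's first see how to implement this, and then I'll share my own understanding and assessment.</p>
<p>The <strong>two most critical</strong> variables in survival analysis are the survival event and the survival time. The event records whether a patient has died or whether their status is unknown; death is usually coded as 1 and unknown status as 0. The latter is commonly called a censored observation: either the study period ended while the patient was still alive, or the patient was lost to follow-up. I'm not trying to lecture on concepts here, and my description is not necessarily precise; search Google or Baidu for the details, which I won't repeat.</p>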
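<p>In R these two variables are usually combined with <code>Surv()</code> from the <code>survival</code> package. A minimal sketch with three patients whose values are chosen purely for illustration (the variable names are assumptions too):</p>
<figure class="highlight r"><table><tr><td class="code"><pre><span class="line">library(survival)</span><br><span class="line"># Follow-up time in days and event indicator (1 = death, 0 = censored)</span><br><span class="line">time <- c(1523, 121, 607)</span><br><span class="line">event <- c(0, 1, 0)</span><br><span class="line">s <- Surv(time, event)</span><br><span class="line"># Censored observations are printed with a trailing "+"</span><br><span class="line">print(s)</span><br></pre></td></tr></table></figure>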
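<p>The goal of research analysis can largely be traced back to finding differences: if what you find differs from what others find, you have something to say and can publish. So the third indispensable variable in survival analysis is the <strong>group</strong> variable, used to compare and look for differences.</p>
<p>Sometimes the groups are self-evident. For example, if we want to analyze the difference between a cancer tissue and normal tissue, the grouping is obvious and can be fixed when the experiment or analysis is designed. Such data are the simplest for survival analysis: run the standard code and read the results.</p>
<p>If this is the analysis you want, a quick search on Baidu will turn up plenty of posts that solve it. In R it comes down to the following steps:</p>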
<figure class="highlight r"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment"># Load the analysis and plotting packages</span></span><br><span class="line"><span class="keyword">library</span>(survival)</span><br><span class="line"><span class="keyword">library</span>(survminer)</span><br><span class="line"></span><br><span class="line"><span class="comment"># Read in the data</span></span><br><span class="line">df <- read.table(<span class="keyword">...</span>)</span><br><span class="line"></span><br><span class="line"><span class="comment"># Fit the survival model</span></span><br><span class="line">sfit <- survfit(Surv(time, event) ~ group, data=df)</span><br><span class="line"></span><br><span class="line"></span><br><span class="line"><span class="comment"># Plot</span></span><br><span class="line">ggsurvplot(</span><br><span class="line"> sfit, <span class="comment"># survfit object with calculated statistics.</span></span><br><span class="line"> data = df, <span class="comment"># data used to fit survival curves.</span></span><br><span class="line"> risk.table = <span class="literal">FALSE</span>, <span class="comment"># show risk table.</span></span><br><span class="line"> pval = <span class="literal">TRUE</span>, <span class="comment"># show p-value of log-rank test.</span></span><br><span class="line"> 
conf.int = <span class="literal">FALSE</span>, <span class="comment"># show confidence intervals for</span></span><br><span class="line"> <span class="comment"># point estimates of survival curves.</span></span><br><span class="line"> palette = c(<span class="string">"red"</span>, <span class="string">"blue"</span>),</span><br><span class="line"> <span class="keyword">...</span>)</span><br></pre></td></tr></table></figure>
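<p>The plotting function involves a number of parameter settings; see the post <a href="https://www.jianshu.com/p/2da8eb255596" target="_blank" rel="noopener">How to draw good-looking survival curves</a>.</p>
<p>If we want to compare survival across a continuous variable, we obviously have to define the groups before fitting the survival model.</p>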
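<p>As a minimal sketch of that grouping step, the code below splits a continuous variable at its median using base R, and also shows a quartile-based alternative. The expression values, the column names, and the High/Low labels are assumptions for illustration; this is only a simplified stand-in for a dedicated grouping function.</p>
<figure class="highlight r"><table><tr><td class="code"><pre><span class="line"># Illustrative expression values for six samples (assumption)</span><br><span class="line">df <- data.frame(expression = c(9.53, 9.46, 9.56, 11.04, 10.45, 10.90))</span><br><span class="line"># Median split: samples above the median go to "High", the rest to "Low"</span><br><span class="line">cutoff <- median(df$expression)</span><br><span class="line">df$group <- ifelse(df$expression > cutoff, "High", "Low")</span><br><span class="line">table(df$group)</span><br><span class="line"># Quartile-based alternative: keep only the bottom and top 25% of samples</span><br><span class="line">q <- quantile(df$expression, probs = c(0.25, 0.75))</span><br><span class="line">extremes <- df[df$expression <= q[1] | df$expression >= q[2], ]</span><br><span class="line">nrow(extremes)</span><br></pre></td></tr></table></figure>
<p>Once a <code>group</code> column like this exists, it can be used on the right-hand side of the <code>survfit</code> formula in the same way as a predefined group.</p>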
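<p>This kind of problem is the most annoying yet hard to avoid. The effect of gene expression on prognosis presents itself in exactly this form, and after doing it a few times I grew tired of repeatedly changing the group settings.</p>
<p>To work more efficiently, I spent some time writing both the grouping and the plotting steps as functions and put them on <a href="https://gist.github.com/ShixiangWang/75ae36de5d1b42c3d4de79986d03e16b" target="_blank" rel="noopener">Gist</a>; download them if you need them.</p>
<p>Try not to modify the first function (grouping). The second function (plotting) involves more parameter settings and gives you more freedom, so adapt it to your needs.</p>
<p>Let's first look at the sample data I loaded:</p>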
<figure class="highlight r"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">head(df)</span><br></pre></td></tr></table></figure>
<figure class="highlight lsl"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br></pre></td><td class="code"><pre><span class="line">## samples expression OS OS_IND</span><br><span class="line">## <span class="number">1</span> TCGA<span class="number">-05</span><span class="number">-4249</span><span class="number">-01</span> <span class="number">9.53</span> <span class="number">1523</span> <span class="number">0</span></span><br><span class="line">## <span class="number">2</span> TCGA<span class="number">-05</span><span class="number">-4250</span><span class="number">-01</span> <span class="number">9.46</span> <span class="number">121</span> <span class="number">1</span></span><br><span class="line">## <span class="number">3</span> TCGA<span class="number">-05</span><span class="number">-4382</span><span class="number">-01</span> <span class="number">9.56</span> <span class="number">607</span> <span class="number">0</span></span><br><span class="line">## <span class="number">4</span> TCGA<span class="number">-05</span><span class="number">-4384</span><span class="number">-01</span> <span class="number">11.04</span> <span class="number">426</span> <span class="number">0</span></span><br><span class="line">## <span class="number">5</span> TCGA<span class="number">-05</span><span class="number">-4389</span><span class="number">-01</span> <span class="number">10.45</span> <span class="number">1369</span> <span class="number">0</span></span><br><span class="line">## <span class="number">6</span> TCGA<span class="number">-05</span><span class="number">-4390</span><span class="number">-01</span> <span class="number">10.90</span> <span class="number">1126</span> <span class="number">0</span></span><br></pre></td></tr></table></figure>
<p>Load the prepared function script:</p>
<figure class="highlight r"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">source</span>(<span class="string">"~/Desktop/groupSurvival.R"</span>)</span><br><span class="line"></span><br><span class="line">args(groupSurvival)</span><br></pre></td></tr></table></figure>
<figure class="highlight clean"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br></pre></td><td class="code"><pre><span class="line">## function (df, event = <span class="string">"OS_IND"</span>, time = <span class="string">"OS"</span>, var = NULL, time.limit = NULL,</span><br><span class="line">## interval = c(<span class="string">"open"</span>, <span class="string">"close"</span>), method = c(<span class="string">"quartile"</span>, <span class="string">"mean"</span>,</span><br><span class="line">## <span class="string">"median"</span>, <span class="string">"percent"</span>, <span class="string">"custom"</span>), percent = NULL, step = <span class="number">20</span>,</span><br><span class="line">## custom_fun = NULL, group1 = <span class="string">"High"</span>, group2 = <span class="string">"Low"</span>)</span><br><span class="line">## NULL</span><br></pre></td></tr></table></figure>
<figure class="highlight r"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">args(plot_surv)</span><br></pre></td></tr></table></figure>
<figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta">#</span><span class="bash"><span class="comment"># function (os_mat, cutoff = NULL, pval = TRUE, ...)</span></span></span><br><span class="line"><span class="meta">#</span><span class="bash"><span class="comment"># NULL</span></span></span><br></pre></td></tr></table></figure>
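<p><code>groupSurvival</code> is the main function. Each of its parameters has a purpose: they specify the three key variables (<code>event</code>, <code>time</code> and <code>var</code>), the grouping method and the group names. I also wrote an internal function that computes p-values at intervals set by <code>step</code>; the minimum p-value and its corresponding time are returned as part of the result list.</p>
<p>Use the function to group samples by gene expression, with <code>median</code> as the grouping method.</p>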
<figure class="highlight r"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">res <- groupSurvival(df=df, event=<span class="string">"OS_IND"</span>, time=<span class="string">"OS"</span>, var=<span class="string">"expression"</span>,method=<span class="string">"median"</span>)</span><br></pre></td></tr></table></figure>
<p>Plot the result:</p>
<figure class="highlight r"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">plot_surv(res$data)</span><br></pre></td></tr></table></figure>
<p><img src="/images/plot_surv-1.png" alt="plot of chunk plot_surv"></p>
<p>Set a time cutoff:</p>
<figure class="highlight r"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">plot_surv(res$data, cutoff = <span class="number">3000</span>)</span><br></pre></td></tr></table></figure>
<p><img src="/images/plot_surv2-1.png" alt="plot of chunk plot_surv2"></p>
<p>Group by percentage (the top and bottom fraction of samples), set the fraction with <code>percent</code> (1 means 100%), and plot:</p>
<figure class="highlight r"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br></pre></td><td class="code"><pre><span class="line">res <- groupSurvival(df=df, event=<span class="string">"OS_IND"</span>, time=<span class="string">"OS"</span>, var=<span class="string">"expression"</span>,method=<span class="string">"percent"</span>, percent = <span class="number">0.1</span>)</span><br><span class="line"></span><br><span class="line">plot_surv(res$data, cutoff = <span class="number">3000</span>)</span><br></pre></td></tr></table></figure>
<p><img src="/images/grouping_and_plot-1.png" alt="plot of chunk grouping_and_plot"></p>
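<p>If you have some R programming experience, you can run every method with these two functions, inspect the results, and then settle on a suitable grouping.</p>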
<hr>
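<p>Finally, should we choose the method according to the results, or commit to a method first and accept whatever results follow? This question hangs over this kind of analysis like a sword. What does a so-called <strong>difference</strong> really mean? When doing such analyses we need to hold ourselves to both ethical and professional standards.</p>
<p>Whether or not everyone agrees, doing your own part well is enough.</p>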
]]></content>
<categories>
<category> 极客R </category>
</categories>
<tags>
<tag> R </tag>
<tag> 生存分析 </tag>
<tag> survival </tag>
<tag> survminer </tag>
</tags>
</entry>
<entry>
<title><![CDATA[How to write posts with Rmarkdown in the hexo blog system]]></title>
<url>/2018/02/06/how-to-write-rmd-documents-in-hexo-system/</url>
<content type="html"><![CDATA[<p>As someone who does data analysis, I can rarely write without figures and code. <code>markdown</code> distinguishes and highlights code well, but you can never produce a figure by writing <code>markdown</code> alone; such documents are called static. To solve this problem there are two very popular dynamic document tools: <a href="http://jupyter.org/" target="_blank" rel="noopener"><code>Jupyter Notebook</code></a> from the <code>Python</code> world, and <a href="http://rmarkdown.rstudio.com/rmarkdown_websites.html#overview" target="_blank" rel="noopener"><code>Rmarkdown</code></a>, developed by the <code>Rstudio</code> company.</p>
<a id="more"></a>
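<p>In terms of popularity, <code>Jupyter Notebook</code> is well ahead of <code>Rmarkdown</code>. It supports not only <code>Python</code> but also <code>R</code>, and it has been extended to a number of other languages (of course, <code>Rmarkdown</code> can now also run <code>Python</code>, <code>Shell</code> and other code). For reasons I do not understand, though, the <code>markdown</code> exported from <code>Jupyter Notebook</code> is really ugly, even though <code>Github</code> somehow manages to recognize notebooks and render them properly. Notebooks are also inconvenient for writing blog posts. Writing posts directly in <code>Rmarkdown</code> is much easier: the syntax is nearly the same as <code>markdown</code>, and Yihui Xie's <code>knitr</code> package can convert the file into a <code>markdown</code> document.</p>
<p>In short, as an R enthusiast, of course I do it with R.</p>
<p>Yihui Xie has developed a package called <code>blogdown</code> for building and deploying blogs with <code>Rmarkdown</code>; it supports all <code>hugo</code> themes. If you use R and do not have a blog yet, I recommend reading <a href="https://bookdown.org/yihui/blogdown/" target="_blank" rel="noopener">https://bookdown.org/yihui/blogdown/</a> to create one and publish posts with <code>Rmarkdown</code>. If you prefer the <code>hexo</code> blog system and want to write posts in <code>Rmarkdown</code>, read on; the approach described here is not limited to <code>hexo</code>.</p>
<p>What I actually did was write two <code>R</code> functions: one creates an <code>Rmarkdown</code> document, and the other converts it into a <code>markdown</code> document with the <code>knit</code> function from the <code>knitr</code> package.</p>
<p>The required files and code have been uploaded to Gist; you can <a href="https://gist.github.com/ShixiangWang/197cbe60c6fa096888b701af72511740" target="_blank" rel="noopener">view and download them here</a>.</p>
<h2 id="第一步"><a href="#第一步" class="headerlink" title="第一步"></a>Step 1</h2><p>Create an <code>Rmarkdown</code> template so that the YAML front-matter can be generated easily every time you write a new post.</p>
<p>Its content is shown below. It simply sets the title, author, date, categories and tags; change it to suit your needs:</p>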
<script src="//gist.github.com/197cbe60c6fa096888b701af72511740.js?file=template.Rmd"></script>
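<p>The date in the Gist template above is an <code>R</code> command; when we later run <code>knit</code>, it is automatically replaced with the date on which the <code>markdown</code> document is generated.</p>
<p>You can define your own fields, as long as they follow the <a href="https://hexo.io/zh-cn/docs/front-matter.html" target="_blank" rel="noopener">front-matter specification</a>:</p>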
<figure class="highlight yaml"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta">---</span></span><br><span class="line"><span class="attr">title:</span> <span class="string">"Put your title here"</span></span><br><span class="line"><span class="attr">author:</span> <span class="string">王诗翔</span></span><br><span class="line"><span class="attr">date:</span> <span class="string">"2018-02-06 12:36:48"</span></span><br><span class="line"><span class="attr">top:</span> <span class="literal">false</span></span><br><span class="line"><span class="attr">categories:</span> <span class="string">Linux杂烩</span></span><br><span class="line"><span class="attr">tags:</span></span><br><span class="line"><span class="bullet"> -</span> <span class="string">Linux</span></span><br><span class="line"><span class="meta">---</span></span><br></pre></td></tr></table></figure>
<p>Save it as <code>template.Rmd</code> (<code>Rmarkdown</code> files usually end in <code>.Rmd</code> or <code>.rmd</code>).</p>
<h2 id="第二步"><a href="#第二步" class="headerlink" title="第二步"></a>Step 2</h2><p>Save the following two functions in an R file (ending in <code>.R</code>):</p>
<figure class="highlight r"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br><span class="line">39</span><br><span class="line">40</span><br><span class="line">41</span><br><span class="line">42</span><br><span class="line">43</span><br><span class="line">44</span><br><span class="line">45</span><br><span class="line">46</span><br><span class="line">47</span><br><span class="line">48</span><br><span class="line">49</span><br><span class="line">50</span><br><span class="line">51</span><br><span class="line">52</span><br><span class="line">53</span><br><span class="line">54</span><br><span class="line">55</span><br><span class="line">56</span><br><span class="line">57</span><br><span class="line">58</span><br><span class="line">59</span><br><span class="line">60</span><br><span class="line">61</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">#######################</span></span><br><span class="line"><span class="comment">## Blogging with Rmd ##</span></span><br><span class="line"><span class="comment">#######################</span></span><br><span class="line"></span><br><span class="line"><span class="comment"># Author: 王诗翔</span></span><br><span class="line"><span class="comment"># Last updated: 2018-02-05</span></span><br><span class="line"></span><br><span class="line"></span><br><span class="line"><span class="comment">#>>>>>> new_rmd_post function <<<<<<<<<<</span></span><br><span class="line"><span class="comment"># Once the template is written, use this function to create an Rmarkdown document</span></span><br><span class="line"><span class="comment"># Arguments:</span></span><br><span class="line"><span class="comment">#   post_name: post name (English is best, to avoid garbled characters); for this post I used</span></span><br><span class="line"><span class="comment">#              how-to-write-rmd-documents-in-hexo-system</span></span><br><span class="line"><span class="comment">#              it need not be accurate, only distinct from other post names</span></span><br><span class="line"><span class="comment">#   template_name: template file name; template.Rmd is the best choice, since every post uses it</span></span><br><span class="line"><span class="comment">#              and you then never have to specify the template name</span></span><br><span class="line"><span class="comment">#   template_path: path to the template, defaults to the current working directory</span></span><br><span class="line"><span class="comment">#   post_path: where to put the generated document, defaults to the current working directory</span></span><br><span class="line">new_rmd_post <- <span class="keyword">function</span>(post_name=<span class="literal">NULL</span>,template_name=<span class="string">"template.Rmd"</span>,</span><br><span class="line">                         template_path=getwd(), post_path=getwd()){</span><br><span class="line">    <span class="keyword">if</span>(is.null(post_name)){</span><br><span class="line">        <span class="keyword">stop</span>(<span class="string">"A post name must be given!"</span>)</span><br><span class="line">    }</span><br><span class="line"></span><br><span class="line">    input_file <- paste(template_path,template_name, sep=<span class="string">"/"</span>)</span><br><span class="line">    current_time <- Sys.Date()</span><br><span class="line">    out_file <- file.path(post_path, paste0(current_time, <span class="string">"-"</span>,post_name,<span class="string">".Rmd"</span>))</span><br><span class="line">    fl_content <- readLines(input_file)</span><br><span class="line">    writeLines(fl_content, out_file)</span><br><span class="line">    print(<span class="string">"New Rmarkdown post creat successfully!"</span>)</span><br><span class="line">}</span><br><span class="line"></span><br><span class="line"><span class="comment">#>>>>> new_md_post function <<<<<<<<<<</span></span><br><span class="line"><span class="comment"># Use this function to convert an Rmd document into a markdown document</span></span><br><span class="line"><span class="comment"># Requires the knitr package: install.packages("knitr")</span></span><br><span class="line"><span class="comment"># Arguments:</span></span><br><span class="line"><span class="comment">#   post_name: post file name; the form year-month-day-english-name is recommended</span></span><br><span class="line"><span class="comment">#   template_name: the Rmd document you want to convert</span></span><br><span class="line"><span class="comment">#   template_path: path to the Rmd document, defaults to the current working directory</span></span><br><span class="line"><span class="comment">#   post_path: where to put the generated document, defaults to the current working directory</span></span><br><span class="line"><span class="comment">#   time_tag: date tag; if the document being converted has no year-month-day prefix,</span></span><br><span class="line"><span class="comment">#             setting time_tag to TRUE prepends it automatically</span></span><br><span class="line">new_md_post <- <span class="keyword">function</span>(post_name=<span class="literal">NULL</span>,template_name=<span class="string">"template.Rmd"</span>,template_path=getwd(),</span><br><span class="line">                        post_path=getwd(),time_tag=<span class="literal">FALSE</span>){</span><br><span class="line"></span><br><span class="line">    <span class="keyword">if</span>(is.null(post_name)){</span><br><span class="line">        post_name <- gsub(pattern = <span class="string">"^(.*)\\.[Rr]md$"</span>, <span class="string">"\\1"</span>, x = template_name)</span><br><span class="line">    }</span><br><span class="line"></span><br><span class="line">    input_file <- paste(template_path,template_name, sep=<span class="string">"/"</span>)</span><br><span class="line">    <span class="comment"># retrieve system date</span></span><br><span class="line">    <span class="keyword">if</span>(time_tag){</span><br><span class="line">        current_time <- Sys.Date()</span><br><span class="line">        out_file <- paste0(post_path, <span class="string">"/"</span>, current_time, <span class="string">"-"</span>,post_name,<span class="string">".md"</span>)</span><br><span class="line">    }<span class="keyword">else</span>{</span><br><span class="line">        out_file <- paste0(post_path, <span class="string">"/"</span>, post_name,<span class="string">".md"</span>)</span><br><span class="line">    }</span><br><span class="line"></span><br><span class="line">    knitr::knit(input = input_file, output = out_file)</span><br><span class="line">    print(<span class="string">"New markdown post creat successfully!"</span>)</span><br><span class="line">}</span><br></pre></td></tr></table></figure>
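<p>I saved it as <code>new_post.R</code>. The code above is commented in detail; please read the comments carefully before using it.</p>
<h2 id="使用"><a href="#使用" class="headerlink" title="使用"></a>Usage</h2><p>Taking this very post, written in <code>Rmarkdown</code>, as an example, here is a brief walkthrough.</p>
<p>I recommend creating an <code>_rmd</code> directory at the same level as your <code>markdown</code> posts directory. You can make it a project directory dedicated to writing <code>rmarkdown</code> documents.</p>
<p>Alternatively, set the working directory with <code>setwd()</code> each time.</p>
<p>Put the two files created in the previous two steps into that directory, then source the R file:</p>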
<figure class="highlight r"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">source</span>(<span class="string">"./new_post.R"</span>)</span><br></pre></td></tr></table></figure>
<p>Now the two functions can be called from the R console.</p>
<p>Create an <code>Rmarkdown</code> document:</p>
<figure class="highlight r"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br></pre></td><td class="code"><pre><span class="line">> new_rmd_post(<span class="string">"how-to-write-rmd-documents-in-hexo-system"</span>)</span><br><span class="line">[<span class="number">1</span>] <span class="string">"New Rmarkdown post creat successfully!"</span></span><br></pre></td></tr></table></figure>
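<p>The date (year-month-day) and the file extension are added to the file name automatically.</p>
<p>Now you can start writing. When you are done, convert the <code>Rmarkdown</code> file into a <code>markdown</code> document:</p>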
<figure class="highlight r"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br></pre></td><td class="code"><pre><span class="line">> new_md_post(template_name = <span class="string">"2018-02-05-how-to-write-rmd-documents-in-hexo-system.Rmd"</span>, post_path = <span class="string">"../_posts"</span>)</span><br><span class="line"></span><br><span class="line"></span><br><span class="line">processing file: /home/wsx/blog/<span class="keyword">source</span>/_rmd/<span class="number">2018</span>-<span class="number">02</span>-<span class="number">05</span>-how-to-write-rmd-documents-<span class="keyword">in</span>-hexo-system.Rmd</span><br><span class="line"> |.................................................................| <span class="number">100</span>%</span><br><span class="line"> inline R code fragments</span><br><span class="line"></span><br><span class="line"></span><br><span class="line">output file: ../_posts/<span class="number">2018</span>-<span class="number">02</span>-<span class="number">05</span>-how-to-write-rmd-documents-<span class="keyword">in</span>-hexo-system.md</span><br><span class="line"></span><br><span class="line">[<span class="number">1</span>] <span class="string">"New markdown post creat successfully!"</span></span><br></pre></td></tr></table></figure>
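<p>Attentive readers will notice the benefit of creating a separate <code>_rmd</code> directory and making it the working directory: very few arguments need to be typed. Once you settle on your own workflow and set the default path arguments of the two functions accordingly, <code>TAB</code> completion in <code>R</code> lets you run the commands in seconds, so you can concentrate on writing.</p>
<p>This post does not involve any plots. In principle they should work without problems, since all I have built is a quick interface for creating and converting documents. I will certainly need plots later and will cover them then.</p>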
]]></content>
<categories>
<category> 极客R </category>
</categories>
<tags>
<tag> R </tag>
<tag> 写作 </tag>
<tag> Rmd </tag>
<tag> md </tag>
<tag> Rstudio </tag>
<tag> Gist </tag>
</tags>
</entry>
<entry>
<title><![CDATA[How to filter specific lines in sed before running a command]]></title>
<url>/2018/01/31/sed-how-to-filter-rows-before-using-command/</url>
<content type="html"><![CDATA[<p>Someone asked this question in a WeChat group:</p>
<blockquote>
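<p>I want to replace every A and C in a reference genome with C and A respectively, and apply the same rule to lowercase letters. How can I do this? tr can do it, but I want to make sure the chromosome names after > are not changed.<br><img src="http://upload-images.jianshu.io/upload_images/3884693-da7fbb654d3f7bc1.png?imageMogr2/auto-orient/strip%7CimageView2/2/w/1240" alt=""></p>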
</blockquote>
<a id="more"></a>
<p>The <a href="http://man.linuxde.net/tr" target="_blank" rel="noopener"><code>tr</code></a> command can indeed do one-to-one character mapping, but that alone clearly does not solve this problem: it also changes text that should stay untouched.</p>
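<p>To see the problem concretely, here is a minimal sketch on a made-up two-line FASTA sample (the header <code>>chrA</code> and the sequence are invented for illustration). <code>tr</code> maps characters on every line, so the header is altered as well:</p>

```shell
# swap_with_tr: swap A<->C and a<->c on every input line (headers included).
swap_with_tr() {
    tr 'ACac' 'CAca'
}

# The header ">chrA" is damaged along with the sequence line.
printf '>chrA\nACac\n' | swap_with_tr
# prints ">ahrC" then "CAca"
```

<p>The sequence line is converted as intended, but the chromosome name is no longer usable.</p>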
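<p>The key is to print the lines marked with > unchanged while applying the conversion to the DNA sequence lines. sed can do this quickly. Many readers may remember <code>sed</code> usage in the form I wrote in my <a href="https://shixiangwang.github.io/2017/09/03/Linux-data-analysis-tools/#用Sed进行流编辑">notes</a>:</p>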
<figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">command/pattern/replacement/flag</span><br></pre></td></tr></table></figure>
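<p><code>command</code> is a command such as <code>s</code> or <code>d</code>, and <code>flag</code> is a flag such as <code>i</code> or <code>g</code>.</p>
<p>That picture of sed is incomplete. sed also lets you give a text pattern (an address) that selects the lines the command acts on, in this form:</p>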
<figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">/pattern/command</span><br></pre></td></tr></table></figure>
<p>Putting it together, the sed format (leaving options aside) is really:</p>
<figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">/pattern/command/pattern/replacement/flag</span><br></pre></td></tr></table></figure>
<p>The first <code>pattern</code> selects the lines to process; the second <code>pattern</code> is what the sed command matches within those lines.</p>
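<p>As a small illustration of the two patterns working together, the sketch below uses an invented input and a hypothetical helper named <code>mask_n</code>. The address <code>/^seq/</code> selects lines, and the <code>s</code> command then replaces <code>N</code> only on those lines:</p>

```shell
# mask_n: on lines starting with "seq", replace every N with "-".
# Lines that do not match the address /^seq/ pass through unchanged.
mask_n() {
    sed '/^seq/s/N/-/g'
}

printf 'name NN\nseq ANNT\n' | mask_n
# prints "name NN" then "seq A--T"
```

<p>The first line keeps its <code>N</code> characters because the address never selected it.</p>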
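<p>With this feature, the original question can be solved very simply with a regular expression:</p>
<p>First use <code>/^[^>]/</code> to select the lines that do not start with <code>></code>, then apply the <code>sed</code> character-translation command <code>y</code>.</p>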
<figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br></pre></td><td class="code"><pre><span class="line"></span><br><span class="line"><span class="meta">#</span><span class="bash"> an arbitrary test file</span></span><br><span class="line"><span class="meta">$</span><span class="bash"> cat <span class="built_in">test</span></span></span><br><span class="line"><span class="meta">></span><span class="bash"> CAACCA</span></span><br><span class="line">acgtACGTACTGagct</span><br><span class="line">tacgactNNNNNNNNNNNNNNNN</span><br><span class="line"></span><br><span class="line"><span class="meta">$</span><span class="bash"> cat <span class="built_in">test</span> | sed <span class="string">'/^[^>]/y/ACac/CAca/'</span></span></span><br><span class="line"><span class="meta">></span><span class="bash"> CAACCA</span></span><br><span class="line">cagtCAGTCATGcgat</span><br><span class="line">tcagcatNNNNNNNNNNNNNNNN</span><br></pre></td></tr></table></figure>
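<p>An equivalent way to write the same filter, offered here as an alternative sketch rather than the form used above, is to negate the address with <code>!</code>: "on every line that does not match <code>/^>/</code>, run <code>y</code>". The helper name <code>swap_ac</code> and the sample input are made up for illustration:</p>

```shell
# swap_ac: swap A<->C and a<->c on every line that is NOT a FASTA header.
# "/^>/!" negates the address, so header lines are printed untouched.
swap_ac() {
    sed '/^>/!y/ACac/CAca/'
}

printf '>chrA CAAC\nacgtACGT\n\nNNac\n' | swap_ac
# prints ">chrA CAAC", "cagtCAGT", an empty line, then "NNca"
```

<p>Both forms leave header lines alone; which one reads better is mostly a matter of taste.</p>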
<hr>
<p>If you are interested, see my note <a href="https://shixiangwang.github.io/2017/12/25/sed-and-gawk/">A first look at sed and awk</a>.</p>
]]></content>
<categories>
<category> Linux杂烩 </category>
</categories>
<tags>
<tag> sed </tag>
<tag> 生物信息学 </tag>
<tag> 文本处理 </tag>
<tag> fasta </tag>
</tags>
</entry>
<entry>
<title><![CDATA[Sync deploy command tools]]></title>
<url>/2018/01/31/sync-deploy-tools/</url>
<content type="html"><![CDATA[<p>This command set makes it easy to upload files to a remote host/server, run remote scripts, download files, and so on.</p>
<div class="github-widget" data-repo="ShixiangWang/sync-deploy"></div>
<a id="more"></a>
<p><strong>Contents</strong>:</p>
<ul>
<li><a href="https://github.com/ShixiangWang/sync-deploy#目的" target="_blank" rel="noopener">Purpose</a></li>
<li><a href="https://github.com/ShixiangWang/sync-deploy#下载与使用" target="_blank" rel="noopener">Download and usage</a></li>
<li><a href="https://github.com/ShixiangWang/sync-deploy#准备与配置" target="_blank" rel="noopener">Preparation and configuration</a></li>
<li><a href="https://github.com/ShixiangWang/sync-deploy#命令说明" target="_blank" rel="noopener">Command reference</a><ul>
<li><a href="https://github.com/ShixiangWang/sync-deploy#sync-command" target="_blank" rel="noopener">sync-command</a></li>
<li><a href="https://github.com/ShixiangWang/sync-deploy#sync-upload" target="_blank" rel="noopener">sync-upload</a></li>
<li><a href="https://github.com/ShixiangWang/sync-deploy#sync-download" target="_blank" rel="noopener">sync-download</a></li>
<li><a href="https://github.com/ShixiangWang/sync-deploy#sync-run" target="_blank" rel="noopener">sync-run</a></li>
<li><a href="https://github.com/ShixiangWang/sync-deploy#sync-deploy" target="_blank" rel="noopener">sync-deploy</a></li>
<li><a href="https://github.com/ShixiangWang/sync-deploy#sync-check" target="_blank" rel="noopener">sync-check</a></li>
</ul>
</li>
<li><a href="https://github.com/ShixiangWang/sync-deploy#计算操作实例" target="_blank" rel="noopener">Worked computing example</a></li>
</ul>
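<h2 id="目的"><a href="#目的" class="headerlink" title="目的"></a>Purpose</h2><p>Typing ssh and scp commands interactively to run commands/scripts on a remote host or to upload and download files is not very convenient, and repeatedly typing <code>hostname@ip</code> is painful too. Moreover, when submitting job scripts to a computing platform, editing job parameters and debugging scripts in a remote terminal is awkward. The scripts in this repository are meant to make these tasks easier.</p>
<p>The command set wraps the <code>ssh</code>, <code>scp</code>, <code>qsub</code> and <code>qstat</code> commands, which are used to run remote scripts and commands, upload/download files, submit jobs and check job status, respectively.</p>
<h2 id="下载与使用"><a href="#下载与使用" class="headerlink" title="下载与使用"></a>Download and usage</h2><p><a href="https://github.com/ShixiangWang/sync-deploy/releases" target="_blank" rel="noopener">Click to download</a></p>
<p>Or clone:</p>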
<figure class="highlight crmsh"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">git <span class="keyword">clone</span> <span class="title">https</span>://github.com/ShixiangWang/sync-deploy.git</span><br></pre></td></tr></table></figure>
<p>After downloading, run the <code>add_path.sh</code> script to add the commands to your PATH so they can be used from any directory.</p>
<figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br></pre></td><td class="code"><pre><span class="line">cd sync-deploy/src</span><br><span class="line">./add_path.sh</span><br></pre></td></tr></table></figure>
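<p>Apart from <code>sync-command</code>, which takes no options, the other commands generally need options to be specified.</p>
<p><strong>Accordingly, every command except <code>sync-command</code> has an <code>-h</code> option that prints help</strong>.</p>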
<figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br></pre></td><td class="code"><pre><span class="line">sync-upload -h</span><br><span class="line">sync-download -h</span><br><span class="line">sync-run -h</span><br><span class="line">sync-deploy -h</span><br><span class="line">sync-check -h</span><br></pre></td></tr></table></figure>
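<h2 id="准备与配置"><a href="#准备与配置" class="headerlink" title="准备与配置"></a>Preparation and configuration</h2><p>First, install your local machine's public key on the server to enable passwordless file transfer.</p>
<p>See the article <a href="https://www.liaohuqiu.net/cn/posts/ssh-keygen-abc/" target="_blank" rel="noopener">Basic usage of ssh-keygen</a> or other material to generate a public/private key pair (a search engine will find plenty of such posts, so I will not go into detail).</p>
<p>Copy the contents of the public key <code>id_rsa.pub</code> (in the .ssh subdirectory on the local machine) into <code>authorized_keys</code> in the .ssh subdirectory on the server, appending after any existing text. Create the file if it does not exist.</p>
<p>Test the connection: if you can log in without a password, the setup works.</p>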
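<p>The "append the key, create the file if missing" step can be sketched as a small shell function. This is only an illustration under assumptions: the function name <code>install_pubkey</code> is invented, and it is meant to be run on the server after the public key file has been copied there. It is not part of sync-deploy.</p>

```shell
# install_pubkey PUBKEY_FILE SSH_DIR
# Append the key in PUBKEY_FILE to SSH_DIR/authorized_keys.
# Creates the directory and file if needed and skips keys already present.
install_pubkey() {
    pubkey_file=$1
    ssh_dir=$2
    auth_file="$ssh_dir/authorized_keys"

    # sshd is usually strict about permissions on these paths.
    mkdir -p "$ssh_dir" && chmod 700 "$ssh_dir"
    touch "$auth_file" && chmod 600 "$auth_file"

    key=$(cat "$pubkey_file")
    # -x: whole-line match, -F: fixed string, so the key is compared literally.
    if ! grep -qxF -- "$key" "$auth_file"; then
        printf '%s\n' "$key" >> "$auth_file"
    fi
}

# Example (hypothetical paths): install_pubkey /tmp/id_rsa.pub "$HOME/.ssh"
```

<p>OpenSSH also ships <code>ssh-copy-id</code>, which performs a similar append from the local machine; the function above just makes the manual steps explicit.</p>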
<p><strong>Then open the <code>sync-setting</code> file in the current directory (src/) and change the remote host name and IP address to your own</strong>.</p>
<hr>
<p><strong>If you want to deploy jobs on a computing platform</strong>, open the <code>qsub_header</code> file in the current directory and fill in the PBS parameters. For the settings, see <a href="https://github.com/ShixiangWang/mytoolkit/blob/master/hpc_info.md" target="_blank" rel="noopener">my notes</a> or other resources on Baidu, such as <a href="https://wenku.baidu.com/view/5ab820293169a4517723a3ec.html" target="_blank" rel="noopener">1</a> and <a href="https://wenku.baidu.com/view/14ef7c230722192e4536f6f8.html" target="_blank" rel="noopener">2</a>.</p>
<p><strong>Next, put the commands you want to run into the <code>commands</code> file in the current directory. If you want to run other scripts, call them from that file</strong>.</p>
<h2 id="命令说明"><a href="#命令说明" class="headerlink" title="命令说明"></a>Command reference</h2><h3 id="sync-command"><a href="#sync-command" class="headerlink" title="sync-command"></a>sync-command</h3><p>This is the simplest and most direct command: put the command you want to execute on the remote host right after <code>sync-command</code>.</p>
<figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta">$</span><span class="bash"> sync-command ls -l <span class="string">'~/test'</span></span></span><br><span class="line">总用量 0</span><br><span class="line">-rw-rw-r-- 1 liuxs liuxs 12 1月 30 19:20 job_id</span><br><span class="line">-rw-rw-r-- 1 liuxs liuxs 34 1月 30 19:20 result.txt</span><br><span class="line">-rw-rw-r-- 1 liuxs liuxs 110 1月 29 11:40 test.R</span><br><span class="line">-rwxrw-r-- 1 liuxs liuxs 240 1月 30 19:20 work.sh</span><br></pre></td></tr></table></figure>
<p><strong>Note that symbols such as <code>~</code> that map to a path must be quoted; otherwise the local shell expands them to a local path, which will of course cause problems.</strong></p>
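<p>The effect of quoting can be checked locally without any remote host. In this sketch, <code>show_args</code> is a throwaway helper (not part of sync-deploy) that prints each argument exactly as the called program receives it:</p>

```shell
# show_args: print every argument on its own line, exactly as received.
show_args() {
    printf '%s\n' "$@"
}

show_args ~/test     # unquoted: the LOCAL shell replaces ~ with the local $HOME
show_args '~/test'   # quoted: the literal string ~/test is passed through
```

<p>Only the quoted form leaves <code>~</code> intact for the remote shell to expand on the other side.</p>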
<h3 id="sync-upload"><a href="#sync-upload" class="headerlink" title="sync-upload"></a>sync-upload</h3><p>Upload files to the remote host.</p>
<p>Usage:</p>
<figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">Usage: sync-upload -n local_files -d 'destdir'</span><br></pre></td></tr></table></figure>
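<p>The <code>-n</code> option takes the path(s) of the local files/directories to upload; the <code>-d</code> option takes the destination directory on the remote host.</p>
<p>Examples:</p>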
<figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br></pre></td><td class="code"><pre><span class="line">==> examples:</span><br><span class="line"> sync-upload -n work.sh -d /public/home/liuxs/test</span><br><span class="line"> or</span><br><span class="line"> sync-upload -n work.sh -d '~/test'</span><br></pre></td></tr></table></figure>
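<p>Again, quote <code>~</code> when you use it.</p>
<p><strong>Important: <code>-n</code> and <code>-d</code> cannot be swapped; the options are order-sensitive</strong>. This is what allows <code>-n</code> to take more than one path argument: internally the script relies on the positions of <code>-n</code> and <code>-d</code> and uses a regular expression to capture all the path names, so you can upload several files/directories with a single command (a trade-off).</p>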
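<p>To make the ordering constraint concrete, here is a hypothetical sketch of position-based parsing. It is not the actual <code>sync-upload</code> code: the real script is described as using a regular expression, while this version uses a plain loop, and the function name <code>parse_upload_args</code> is invented. Everything between <code>-n</code> and <code>-d</code> is treated as a path to upload.</p>

```shell
# parse_upload_args -n path1 [path2 ...] -d destdir
# Collect every argument between -n and -d as a file to upload.
# Simplification: paths containing spaces are not handled.
parse_upload_args() {
    if [ "$#" -lt 4 ] || [ "$1" != "-n" ]; then
        echo "Usage: parse_upload_args -n local_files -d 'destdir'" >&2
        return 1
    fi
    shift                      # drop -n
    files=""
    while [ "$#" -gt 0 ] && [ "$1" != "-d" ]; do
        files="$files $1"
        shift
    done
    # Exactly "-d destdir" must remain, and at least one file must be given.
    if [ "$#" -ne 2 ] || [ -z "$files" ]; then
        echo "Usage: parse_upload_args -n local_files -d 'destdir'" >&2
        return 1
    fi
    echo "files:$files"
    echo "dest: $2"
}

parse_upload_args -n a.txt b.txt -d '~/test'
# prints "files: a.txt b.txt" then "dest: ~/test"
```

<p>With this scheme, putting <code>-d</code> first fails immediately, which matches the behaviour described above.</p>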
<h3 id="sync-download"><a href="#sync-download" class="headerlink" title="sync-download"></a>sync-download</h3><p>Download files from the remote host to the local machine.</p>
<p>Usage:</p>
<figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">Usage: sync-download -n 'remote_files' -d localdir</span><br></pre></td></tr></table></figure>
<p>This command is used in essentially the same way as <code>sync-upload</code>.</p>
<p>Examples:</p>
<figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br></pre></td><td class="code"><pre><span class="line">==> examples:</span><br><span class="line"> sync-download -n '~/test/*' -d ./test</span><br><span class="line"> or</span><br><span class="line"> sync-download -n /public/home/liuxs/test/* -d ./test</span><br></pre></td></tr></table></figure>
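<p><strong>Likewise, the <code>-n</code> and <code>-d</code> options cannot be given in reverse order.</strong></p>
<h3 id="sync-run"><a href="#sync-run" class="headerlink" title="sync-run"></a>sync-run</h3><p>Submit a job on the remote host. It wraps the <code>qsub</code> command to submit a job script to the computing platform. If you only want to run a remote script or command, see <code>sync-command</code>.</p>
<p>Usage:</p>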
<figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">sync-run -f work_script</span><br></pre></td></tr></table></figure>
<p>The <code>-f</code> option takes the <strong>single</strong> script you want to run (you need to give the path to the script).</p>
<p>Example:</p>
<figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">sync-run -f /home/wsx/work.sh</span><br></pre></td></tr></table></figure>
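<h3 id="sync-deploy"><a href="#sync-deploy" class="headerlink" title="sync-deploy"></a>sync-deploy</h3><p>Upload files and submit the job in one go.</p>
<p>This command calls <code>sync-upload</code> and <code>sync-run</code> internally, along with a few other scripts. Once configured, it generates the job script <code>work.sh</code> from the two text files <code>qsub_header</code> and <code>commands</code>, uploads the specified files (<code>work.sh</code> is uploaded even if it is not specified), and submits the job to the compute nodes.</p>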
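<p>How <code>work.sh</code> is assembled is not shown here, so the following is only a guess at the simplest possible approach: concatenate the header and the commands. The function name <code>build_work_script</code> and the sample file contents are assumptions for illustration; the real script may do more.</p>

```shell
# build_work_script HEADER_FILE COMMANDS_FILE OUTPUT_FILE
# Assumed behaviour: the PBS header followed by the commands, made executable.
build_work_script() {
    cat "$1" "$2" > "$3" && chmod u+x "$3"
}

# Example with throwaway files in a temporary directory.
demo_dir=$(mktemp -d)
printf '#PBS -N demo\n' > "$demo_dir/qsub_header"
printf 'echo hello\n'   > "$demo_dir/commands"
build_work_script "$demo_dir/qsub_header" "$demo_dir/commands" "$demo_dir/work.sh"
cat "$demo_dir/work.sh"
# prints "#PBS -N demo" then "echo hello"
```

<p>Keeping the scheduler parameters and the commands in separate files means either one can be edited locally without touching the other.</p>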
<p>Usage:</p>
<figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">Usage: sync-deploy -n local_files -d 'destdir'</span><br></pre></td></tr></table></figure>
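<p>Again, mind the quoting of <code>~</code>. Also, if you only want to deploy and run <code>work.sh</code>, still pass <code>work.sh</code> after <code>-n</code>, because an empty <code>-n</code> raises an error. The file will be uploaded twice, but that does no harm.</p>
<p>An example:</p>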
<figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta">$</span><span class="bash"> sync-deploy -n work.sh -d <span class="string">'~/test'</span></span></span><br><span class="line">==> command used: scp -pr -P 22 work.sh /home/wsx/working/sync-deploy/src/work.sh [email protected]:~/test</span><br><span class="line">==></span><br><span class="line">work.sh 100% 240 0.2KB/s 00:00</span><br><span class="line">work.sh 100% 240 0.2KB/s 00:00</span><br><span class="line">==> Files upload successfully.</span><br><span class="line"></span><br><span class="line">==> run as batch mode.......</span><br><span class="line">job_id file locate at ~/test/job_id , id is</span><br><span class="line">87728.node1</span><br><span class="line">==></span><br><span class="line">==> The work deploy successfully.</span><br></pre></td></tr></table></figure>
<h3 id="sync-check"><a href="#sync-check" class="headerlink" title="sync-check"></a>sync-check</h3><p>Checks job status.</p>
<p>Usage:</p>
<figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">Usage: sync-check -n id</span><br></pre></td></tr></table></figure>
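<p>With <code>-n</code> followed by a job ID, it reports the status of that job; without it, it reports the status of all jobs.</p>
<p>Deploying a task returns a job ID. Two jobs have just been submitted, so let's check on them.</p>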
<figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta">$</span><span class="bash"> sync-check -n 87730</span></span><br><span class="line">Job ID Name User Time Use S Queue</span><br><span class="line">------------------------- ---------------- --------------- -------- - -----</span><br><span class="line">87730.node1 work.sh liuxs 00:00:00 C normal_3</span><br><span class="line"></span><br><span class="line"><span class="meta">$</span><span class="bash"> sync-check -n 87730.node1</span></span><br><span class="line">Job ID Name User Time Use S Queue</span><br><span class="line">------------------------- ---------------- --------------- -------- - -----</span><br><span class="line">87730.node1 work.sh liuxs 00:00:00 C normal_3</span><br><span class="line"></span><br><span class="line"><span class="meta">$</span><span class="bash"> sync-check</span></span><br><span class="line">Job ID Name User Time Use S Queue</span><br><span class="line">------------------------- ---------------- --------------- -------- - -----</span><br><span class="line">87729.node1 work.sh liuxs 00:00:00 C normal_3</span><br><span class="line">87730.node1 work.sh liuxs 00:00:00 C normal_3</span><br></pre></td></tr></table></figure>
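<h2 id="计算操作实例"><a href="#计算操作实例" class="headerlink" title="计算操作实例"></a>Worked example</h2><p>Let's walk through a complete example to see how these commands fit together.</p>
<p><strong>Our task</strong> is to use the remote computing platform to run some shell commands and execute an R script.</p>
<p>The R script lives in the <code>test</code> directory under <code>src/</code>; think of it as the main script of our day-to-day work.</p>
<p>What do we need to prepare?</p>
<p>Only the <code>qsub_header</code> and <code>commands</code> files need to be filled in correctly.</p>
<p>First, the contents of <code>qsub_header</code>:</p>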
<figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta">#</span><span class="bash">PBS -l nodes=1:ppn=10</span></span><br><span class="line"><span class="meta">#</span><span class="bash">PBS -S /bin/bash</span></span><br><span class="line"><span class="meta">#</span><span class="bash">PBS -j oe</span></span><br><span class="line"><span class="meta">#</span><span class="bash">PBS -q normal_3</span></span><br><span class="line"></span><br><span class="line"><span class="meta">#</span><span class="bash"> Please <span class="built_in">set</span> PBS arguments above</span></span><br></pre></td></tr></table></figure>
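<p>These are PBS options and arguments; fill them in according to your needs, using the correct syntax. For this test I simply set the nodes and the queue. For the individual parameters, search online or refer to the information given earlier in this documentation.</p>
<p>Next, the <code>commands</code> file:</p>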
<figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta">#</span><span class="bash"> This job<span class="string">'s working directory</span></span></span><br><span class="line">cd ~/test</span><br><span class="line"></span><br><span class="line"><span class="meta">#</span><span class="bash"> Following are commands</span></span><br><span class="line">sleep 20</span><br><span class="line">echo "Thi mission is run successfully!!" > ~/test/result.txt</span><br><span class="line"></span><br><span class="line"><span class="meta">#</span><span class="bash"> call Rscripts</span></span><br><span class="line">Rscript ~/test/test.R > ~/test/result2.txt</span><br></pre></td></tr></table></figure>
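<p>This is the file we will most often need to edit. Here we use <code>cd</code> to set the job's working directory, call <code>sleep</code> to pause for a few seconds so the job does not finish too quickly, use <code>echo</code> to write some text into a result file, and finally run an R script and redirect its output to a second file.</p>
<p>The R script is equally simple; it just prints a few lines of text:</p>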
<figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br></pre></td><td class="code"><pre><span class="line">print("==>")</span><br><span class="line">print("==> Hello world!!!!!!!!")</span><br><span class="line">print("==> ")</span><br></pre></td></tr></table></figure>
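<p>To keep programs from missing files or picking up the wrong ones, it is best to give full paths.</p>
<p><strong>Let's run the commands.</strong></p>
<p>The plan is simple: upload <code>test.R</code> to the working directory on the remote host (note that <code>work.sh</code> is also generated and uploaded automatically; its content is <code>qsub_header</code> and <code>commands</code> combined), run <code>work.sh</code>, and then copy the output back.</p>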
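<p>To make "combined" concrete, here is a minimal sketch that assumes the job script is simply the header file followed by the commands file. The concatenation step and the shortened file contents are assumptions for illustration; they are not taken from the sync-deploy source.</p>

```shell
# Work in a throwaway directory so no real files are touched.
tmpdir=$(mktemp -d)

# Shortened, hypothetical versions of the two input files.
printf '#PBS -q normal_3\n' > "$tmpdir/qsub_header"
printf 'cd ~/test\nsleep 20\n' > "$tmpdir/commands"

# Assumed assembly step: header first, then the commands.
cat "$tmpdir/qsub_header" "$tmpdir/commands" > "$tmpdir/work.sh"

# Show the resulting job script.
cat "$tmpdir/work.sh"

rm -r "$tmpdir"
```

<p>Placing the header first matters because PBS directives are expected at the top of a job script, before the first executable command.</p>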
<p>Uploading and running can be done in one step with the <code>sync-deploy</code> command:<br><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta">#</span><span class="bash"> After add_path.sh adds the commands to PATH, tab completion can be used to find them</span></span><br><span class="line">wsx@Desktop-berry:~$ sync-</span><br><span class="line">sync-check sync-command sync-deploy sync-download sync-run sync-upload</span><br><span class="line"></span><br><span class="line"><span class="meta">#</span><span class="bash"> Use sync-command to inspect the target path</span></span><br><span class="line">wsx@Desktop-berry:~$ sync-command "ls -al ~/test"</span><br><span class="line">总用量 8</span><br><span class="line">drwxrwxr-x 2 liuxs liuxs 4096 1月 30 23:52 .</span><br><span class="line">drwx------ 10 liuxs liuxs 4096 1月 30 22:51 ..</span><br><span class="line"></span><br><span class="line"><span class="meta">#</span><span class="bash"> Deploy the task to the remote host</span></span><br><span class="line"></span><br><span class="line">wsx@Desktop-berry:~$ sync-deploy -n ~/working/sync-deploy/src/test/test.R -d '~/test/'</span><br><span class="line">==> command used: scp -pr -P 22 /home/wsx/working/sync-deploy/src/test/test.R /home/wsx/working/sync-deploy/src/work.sh [email protected]:~/test/</span><br><span class="line">==></span><br><span class="line">test.R 100% 60 0.1KB/s 00:00</span><br><span class="line">work.sh 100% 300 0.3KB/s 00:00</span><br><span class="line">==> Files upload successfully.</span><br><span class="line"></span><br><span class="line">==> run as batch mode.......</span><br><span class="line">job_id file locate at ~/test/job_id , id is</span><br><span class="line">87732.node1</span><br><span class="line">==></span><br><span class="line">==> The work deploy successfully.</span><br></pre></td></tr></table></figure></p>
<p>The task was deployed successfully and a <code>job id</code> was returned. Query it with <code>sync-check</code>:</p>
<figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br></pre></td><td class="code"><pre><span class="line">wsx@Desktop-berry:~$ sync-check 87732</span><br><span class="line">Job ID Name User Time Use S Queue</span><br><span class="line">------------------------- ---------------- --------------- -------- - -----</span><br><span class="line">87732.node1 work.sh liuxs 00:00:00 C normal_3</span><br></pre></td></tr></table></figure>
<p>Because the task is short, it finished quickly and already shows the <code>C</code> (completed) state.</p>
<p>Let's look at the remote directory:</p>
<figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br></pre></td><td class="code"><pre><span class="line">wsx@Desktop-berry:~$ sync-command ls '~/test'</span><br><span class="line">job_id</span><br><span class="line">result2.txt</span><br><span class="line">result.txt</span><br><span class="line">test.R</span><br><span class="line">work.sh</span><br></pre></td></tr></table></figure>
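<p>The <code>job_id</code> file stores the job ID, that is, the <code>87732.node1</code> printed earlier. The other files need no explanation.</p>
<p>The last step is to download the results we need to the local machine.</p>
<p>We create a temporary directory to hold them separately, then inspect the file contents:</p>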
<figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br></pre></td><td class="code"><pre><span class="line">wsx@Desktop-berry:~$ mkdir test</span><br><span class="line">wsx@Desktop-berry:~$ sync-download -n "~/test/*" -d ~/test</span><br><span class="line">==> command used: scp -pr -P 22 [email protected]:~/test/* /home/wsx/test</span><br><span class="line">==></span><br><span class="line">job_id 100% 12 0.0KB/s 00:00</span><br><span class="line">result2.txt 100% 51 0.1KB/s 00:00</span><br><span class="line">result.txt 100% 34 0.0KB/s 00:00</span><br><span class="line">test.R 100% 60 0.1KB/s 00:00</span><br><span class="line">work.sh 100% 300 0.3KB/s 00:00</span><br><span class="line">==> Files download successfully.</span><br><span class="line"></span><br><span class="line">wsx@Desktop-berry:~$ cd test/</span><br><span class="line">wsx@Desktop-berry:~/test$ ls</span><br><span class="line">job_id result2.txt result.txt test.R work.sh</span><br><span class="line">wsx@Desktop-berry:~/test$ cat result.txt</span><br><span class="line">Thi mission is run successfully!!</span><br><span class="line">wsx@Desktop-berry:~/test$ cat result2.txt</span><br><span class="line">[1] "==>"</span><br><span class="line">[1] "==> Hello world!!!!!!!!"</span><br><span class="line">[1] "==> "</span><br></pre></td></tr></table></figure>
<p><strong>Task complete!</strong></p>
<h2 id="问题"><a href="#问题" class="headerlink" title="问题"></a>Questions</h2><p>If you run into problems, feel free to <a href="https://github.com/ShixiangWang/sync-deploy/issues" target="_blank" rel="noopener">open an issue</a> to discuss them.</p>
]]></content>
<categories>
<category> Open-source tools </category>
</categories>
</entry>
<entry>
<title><![CDATA[A first look at sed and awk]]></title>
<url>/2017/12/25/sed-and-gawk/</url>
<content type="html"><![CDATA[<p><strong>Topics covered</strong>:</p>
<blockquote>
<ul>
<li>Learning the sed editor</li>
<li>Getting started with the gawk editor</li>
<li>sed editor basics</li>
</ul>
</blockquote>
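<p>One of the most common uses of shell scripts is processing text files, but shell commands alone are a poor fit for manipulating file contents. To process any kind of data in a shell script, we need to be familiar with the sed and gawk tools in Linux; they greatly simplify the data-processing tasks we need to perform.</p>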
<a id="more"></a>
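<h2 id="文本处理"><a href="#文本处理" class="headerlink" title="文本处理"></a>Text processing</h2><p>When we need to process text files automatically without resorting to an interactive text editor, sed and gawk are our best options.</p>
<h3 id="sed编辑器"><a href="#sed编辑器" class="headerlink" title="sed编辑器"></a>The sed editor</h3><p>sed is also known as a <strong>stream editor</strong>: it edits a data stream <strong>according to a set of rules supplied in advance</strong>, before it starts processing the data.</p>
<p>sed processes the data in the stream according to commands, which can be typed at the terminal or stored in a script file.</p>
<p>sed performs the following steps:</p>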
<ul>
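<li>Read one line of data from the input at a time</li>
<li>Match that data against the supplied commands</li>
<li>Modify the data in the stream as the commands specify</li>
<li>Write the new data to STDOUT</li>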
</ul>
<p>This cycle repeats until every line in the stream has been processed.</p>
<p>The sed command has the following format:</p>
<figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">sed options script file</span><br></pre></td></tr></table></figure>
<p>The <code>options</code> let us modify the behaviour of the <code>sed</code> command:</p>
<table>
<thead>
<tr>
<th>Option</th>
<th>Description</th>
</tr>
</thead>
<tbody>
<tr>
<td>-e script</td>
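<td>Adds the commands specified in script to the commands run while processing the input</td>
</tr>
<tr>
<td>-f file</td>
<td>Adds the commands specified in file to the commands run while processing the input</td>
</tr>
<tr>
<td>-n</td>
<td>Suppresses the automatic output of each line; use the print command (<code>p</code>) to produce output instead</td>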
</tr>
</tbody>
</table>
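<p>The <code>script</code> argument specifies a single command to apply to the stream data. If more than one command is needed, either give them on the command line with <code>-e</code> or put them in a separate file and use <code>-f</code>.</p>
<h4 id="在命令行中定义编辑器命令"><a href="#在命令行中定义编辑器命令" class="headerlink" title="在命令行中定义编辑器命令"></a>Defining editor commands on the command line</h4><p>By default sed applies the given commands to the STDIN stream, so it combines well with pipes.</p>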
<figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br></pre></td><td class="code"><pre><span class="line">wsx@wsx-laptop:~/tmp$ echo "This is a test" | sed 's/test/big test/'</span><br><span class="line">This is a big test</span><br></pre></td></tr></table></figure>
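<p>The <code>s</code> command replaces text matching the first pattern between the slashes with the second string (the whole match is replaced, and the pattern may be a regular expression). In this example, <code>big test</code> replaces <code>test</code>.</p>
<p>Suppose we have the following text:</p>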
<figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br></pre></td><td class="code"><pre><span class="line">wsx@wsx-laptop:~/tmp$ cat data1.txt</span><br><span class="line">The quick brown fox jumps over the lazy dog.</span><br><span class="line">The quick brown fox jumps over the lazy dog.</span><br><span class="line">The quick brown fox jumps over the lazy dog.</span><br><span class="line">The quick brown fox jumps over the lazy dog.</span><br><span class="line">The quick brown fox jumps over the lazy dog.</span><br><span class="line">The quick brown fox jumps over the lazy dog.</span><br><span class="line">The quick brown fox jumps over the lazy dog.</span><br><span class="line">The quick brown fox jumps over the lazy dog.</span><br><span class="line">The quick brown fox jumps over the lazy dog.</span><br></pre></td></tr></table></figure>
<p>Run the command and look at the output:</p>
<figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br></pre></td><td class="code"><pre><span class="line">wsx@wsx-laptop:~/tmp$ sed 's/dog/cat/' data1.txt</span><br><span class="line">The quick brown fox jumps over the lazy cat.</span><br><span class="line">The quick brown fox jumps over the lazy cat.</span><br><span class="line">The quick brown fox jumps over the lazy cat.</span><br><span class="line">The quick brown fox jumps over the lazy cat.</span><br><span class="line">The quick brown fox jumps over the lazy cat.</span><br><span class="line">The quick brown fox jumps over the lazy cat.</span><br><span class="line">The quick brown fox jumps over the lazy cat.</span><br><span class="line">The quick brown fox jumps over the lazy cat.</span><br><span class="line">The quick brown fox jumps over the lazy cat.</span><br></pre></td></tr></table></figure>
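<p>The matching string on every line has been replaced.</p>
<p>Remember that sed does not modify the data in the text file; <strong>it only sends the modified data to STDOUT</strong>.</p>
<h4 id="在命令行上使用多个编辑器命令"><a href="#在命令行上使用多个编辑器命令" class="headerlink" title="在命令行上使用多个编辑器命令"></a>Using multiple editor commands on the command line</h4><p>The <code>-e</code> option lets us run several commands:</p>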
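<p>The two points above can be checked with a small sketch. The sample file <code>pets.txt</code> is hypothetical, and the <code>g</code> flag is standard sed behaviour rather than something covered earlier in this text: without it, <code>s</code> replaces only the first match on each line.</p>

```shell
# Work in a throwaway directory so nothing else is touched.
tmpdir=$(mktemp -d)
printf 'dog chases dog\n' > "$tmpdir/pets.txt"   # hypothetical sample file

# Without a flag, s replaces only the first match on each line.
sed 's/dog/cat/' "$tmpdir/pets.txt"     # prints: cat chases dog

# The g flag replaces every match on the line.
sed 's/dog/cat/g' "$tmpdir/pets.txt"    # prints: cat chases cat

# The file on disk is unchanged; sed only wrote to STDOUT.
cat "$tmpdir/pets.txt"                  # prints: dog chases dog

rm -r "$tmpdir"
```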
<figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br></pre></td><td class="code"><pre><span class="line">wsx@wsx-laptop:~/tmp$ sed -e 's/brown/green/; s/dog/cat/' data1.txt</span><br><span class="line">The quick green fox jumps over the lazy cat.</span><br><span class="line">The quick green fox jumps over the lazy cat.</span><br><span class="line">The quick green fox jumps over the lazy cat.</span><br><span class="line">The quick green fox jumps over the lazy cat.</span><br><span class="line">The quick green fox jumps over the lazy cat.</span><br><span class="line">The quick green fox jumps over the lazy cat.</span><br><span class="line">The quick green fox jumps over the lazy cat.</span><br><span class="line">The quick green fox jumps over the lazy cat.</span><br><span class="line">The quick green fox jumps over the lazy cat.</span><br></pre></td></tr></table></figure>
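<p>Both commands are applied to every line of the file. Separate the commands with a semicolon and, to be safe, <strong>leave no space between the end of a command and the semicolon</strong>.</p>
<p>If you prefer not to use semicolons, you can use the secondary prompt of the bash shell to separate the commands:</p>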
<figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br></pre></td><td class="code"><pre><span class="line">wsx@wsx-laptop:~/tmp$ sed -e '</span><br><span class="line"><span class="meta">></span><span class="bash"> s/brown/green/</span></span><br><span class="line"><span class="meta">></span><span class="bash"> s/fox/elephant/</span></span><br><span class="line"><span class="meta">></span><span class="bash"> s/dog/cat/<span class="string">' data1.txt</span></span></span><br><span class="line">The quick green elephant jumps over the lazy cat.</span><br><span class="line">The quick green elephant jumps over the lazy cat.</span><br><span class="line">The quick green elephant jumps over the lazy cat.</span><br><span class="line">The quick green elephant jumps over the lazy cat.</span><br><span class="line">The quick green elephant jumps over the lazy cat.</span><br><span class="line">The quick green elephant jumps over the lazy cat.</span><br><span class="line">The quick green elephant jumps over the lazy cat.</span><br><span class="line">The quick green elephant jumps over the lazy cat.</span><br><span class="line">The quick green elephant jumps over the lazy cat.</span><br></pre></td></tr></table></figure>
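<p>A third way to pass several commands, sketched here as a supplement, is to repeat the <code>-e</code> option once per command. This avoids both the semicolon and the multi-line quoting, which is convenient inside scripts. The input line is the sample sentence used above.</p>

```shell
# One -e per editing command; sed applies them in order to each line.
line='The quick brown fox jumps over the lazy dog.'
printf '%s\n' "$line" | sed -e 's/brown/green/' -e 's/dog/cat/'
# prints: The quick green fox jumps over the lazy cat.
```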
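<h4 id="从文件中读取编辑器命令"><a href="#从文件中读取编辑器命令" class="headerlink" title="从文件中读取编辑器命令"></a>Reading editor commands from a file</h4><p>When there are many sed commands to run, it is more convenient to put them in a separate file and point sed at it with the <code>-f</code> option.</p>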
<figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br></pre></td><td class="code"><pre><span class="line">wsx@wsx-laptop:~/tmp$ cat script1.sed</span><br><span class="line">s/brown/green/</span><br><span class="line">s/fox/elephant/</span><br><span class="line">s/dog/cat/</span><br><span class="line"></span><br><span class="line">wsx@wsx-laptop:~/tmp$ sed -f script1.sed data1.txt</span><br><span class="line">The quick green elephant jumps over the lazy cat.</span><br><span class="line">The quick green elephant jumps over the lazy cat.</span><br><span class="line">The quick green elephant jumps over the lazy cat.</span><br><span class="line">The quick green elephant jumps over the lazy cat.</span><br><span class="line">The quick green elephant jumps over the lazy cat.</span><br><span class="line">The quick green elephant jumps over the lazy cat.</span><br><span class="line">The quick green elephant jumps over the lazy cat.</span><br><span class="line">The quick green elephant jumps over the lazy cat.</span><br><span class="line">The quick green elephant jumps over the lazy cat.</span><br></pre></td></tr></table></figure>
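<p>In this case no semicolon is needed after each command; sed treats each line as a separate command.</p>
<h3 id="gawk程序"><a href="#gawk程序" class="headerlink" title="gawk程序"></a>The gawk program</h3><p>gawk is a more advanced text-processing tool that provides a programming-like environment for modifying and reorganising the data in a file.</p>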
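<p>For a reproducible version of this workflow, the following sketch creates the command file with a here-document and then runs it. The temporary directory is an assumption made to keep the example self-contained; the three commands are the ones from <code>script1.sed</code>.</p>

```shell
# Create the sed command file in a throwaway directory.
tmpdir=$(mktemp -d)
cat > "$tmpdir/script1.sed" <<'EOF'
s/brown/green/
s/fox/elephant/
s/dog/cat/
EOF

# Each line of the file is applied, in order, to every input line.
printf 'The quick brown fox jumps over the lazy dog.\n' |
    sed -f "$tmpdir/script1.sed"
# prints: The quick green elephant jumps over the lazy cat.

rm -r "$tmpdir"
```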
<figure class="highlight plain"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">Note: not every distribution installs gawk by default; install it first if needed.</span><br></pre></td></tr></table></figure>
<p>gawk is the GNU version of the original Unix awk. It takes stream editing a step further by providing a programming language rather than just editor commands.</p>
<p>With it we can:</p>
<ul>
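<li>Define variables to store data</li>
<li>Use arithmetic and string operators to process data</li>
<li>Use structured programming concepts to add logic to the data processing</li>
<li>Generate formatted reports by extracting data elements from a file and rearranging or reformatting them</li>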
</ul>
<p>gawk's report-generating ability is typically used to extract data elements from large text files and format them into a readable report, making the important data easier to read.</p>
<h4 id="基本命令格式"><a href="#基本命令格式" class="headerlink" title="基本命令格式"></a>Basic command format</h4><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">gawk options program file</span><br></pre></td></tr></table></figure>
<p>The options available to gawk are listed below:</p>
<table>
<thead>
<tr>
<th>Option</th>
<th>Description</th>
</tr>
</thead>
<tbody>
<tr>
<td>-F fs</td>
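<td>Specifies the field separator that delimits the data fields in a line</td>
</tr>
<tr>
<td>-f file</td>
<td>Reads the program from the specified file</td>
</tr>
<tr>
<td>-v var=value</td>
<td>Defines a variable in the gawk program and its default value</td>
</tr>
<tr>
<td>-mf N</td>
<td>Specifies the maximum number of fields to process in the data file</td>
</tr>
<tr>
<td>-mr N</td>
<td>Specifies the maximum number of records (lines) in the data file</td>
</tr>
<tr>
<td>-W keyword</td>
<td>Specifies the compatibility mode or warning level for gawk</td>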
</tr>
</tbody>
</table>
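<p>The <strong>real power of gawk lies in its program scripts</strong> (play to a tool's strengths): we can write scripts that read the data in each line of text, process and display it, and produce any kind of output report.</p>
<h4 id="从命令行读取脚本"><a href="#从命令行读取脚本" class="headerlink" title="从命令行读取脚本"></a>Reading the program script from the command line</h4><p>The script commands must be enclosed in a pair of braces, and because the gawk command line assumes the script is a single text string, the script must also be wrapped in single quotes.</p>
<p>A simple example:</p>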
<figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br></pre></td><td class="code"><pre><span class="line">wsx@wsx-laptop:~/tmp$ gawk '{print "Hello World!"}'</span><br><span class="line"></span><br><span class="line">Hello World!</span><br><span class="line">This is a test</span><br><span class="line">Hello World!</span><br><span class="line">This is</span><br><span class="line">Hello World!</span><br><span class="line"></span><br><span class="line">Hello World!</span><br></pre></td></tr></table></figure>
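<p>The <code>print</code> command prints text to STDOUT. If you try running this command, you may be a little disappointed because nothing happens at first: no file name was given, so gawk reads its data from STDIN. Each time we press Enter, gawk runs the program script once on that line of text.</p>
<p>To terminate the program we must signal that the data stream has ended. The bash shell provides a key combination that generates an EOF (End-of-File) character: Ctrl+D.</p>
<h4 id="使用数据字段变量"><a href="#使用数据字段变量" class="headerlink" title="使用数据字段变量"></a>Using data field variables</h4><p>One of gawk's main features is its ability to work with the data in a text file: it automatically assigns a variable to each data element in a line.</p>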
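<p>When the input comes from a pipe instead of the keyboard, the end of the stream is signalled automatically and no Ctrl+D is needed. The sketch below shows this with two made-up input lines. It is written with <code>awk</code> so that it also runs where gawk is not installed; the program uses only features that gawk and other POSIX awk implementations share.</p>

```shell
# The pipe closes after the second line, so awk sees end-of-file on its own.
# The program ignores the line content and prints one greeting per input line.
printf 'first line\nsecond line\n' | awk '{print "Hello World!"}'
```

<p>This prints <code>Hello World!</code> once for each of the two input lines and then exits.</p>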
<ul>
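<li>$0 represents the entire line of text</li>
<li>$1 represents the first data field in the line</li>
<li>$2 represents the second data field in the line</li>
<li>$n represents the nth data field in the line</li>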
</ul>
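<p>When gawk reads a line of text, it splits it into data fields using the predefined field separator. The default field separator is any whitespace character (for example a space or a tab).</p>
<p>In the following example, gawk reads a text file and displays only the value of the first data field.</p>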
<figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br></pre></td><td class="code"><pre><span class="line">wsx@wsx-laptop:~/tmp$ cat data2.txt</span><br><span class="line">One line of test text.</span><br><span class="line">Two lines of test text.</span><br><span class="line">Three lines of test text.</span><br><span class="line">wsx@wsx-laptop:~/tmp$ gawk '{print $1}' data2.txt</span><br><span class="line">One</span><br><span class="line">Two</span><br><span class="line">Three</span><br></pre></td></tr></table></figure>
<p>We can specify a different field separator with the <code>-F</code> option:</p>
<figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br></pre></td><td class="code"><pre><span class="line">wsx@wsx-laptop:~/tmp$ gawk -F: '{print $1}' /etc/passwd</span><br><span class="line">root</span><br><span class="line">daemon</span><br><span class="line">bin</span><br><span class="line">sys</span><br><span class="line">sync</span><br><span class="line">games</span><br><span class="line">man</span><br><span class="line">lp</span><br><span class="line">mail</span><br><span class="line">news</span><br><span class="line">uucp</span><br><span class="line">proxy</span><br><span class="line">www-data</span><br><span class="line">backup</span><br><span class="line">...</span><br></pre></td></tr></table></figure>
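<p>This short program displays the first data field of each entry in the system's password file.</p>
<h4 id="在程序脚本中使用多个命令"><a href="#在程序脚本中使用多个命令" class="headerlink" title="在程序脚本中使用多个命令"></a>Using multiple commands in the program script</h4><p>Simply put a semicolon between the commands.</p>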
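<p>Because <code>/etc/passwd</code> differs from machine to machine, here is a sketch of the same idea on fixed input. The two passwd-style lines are invented sample data, and <code>awk</code> is used so the example also runs without gawk installed; gawk accepts the same program.</p>

```shell
# Two invented lines in /etc/passwd format (fields separated by colons).
printf 'root:x:0:0:root:/root:/bin/bash\nwsx:x:1000:1000::/home/wsx:/bin/bash\n' |
    awk -F: '{print $1, $7}'   # field 1 is the user name, field 7 the login shell
# prints:
# root /bin/bash
# wsx /bin/bash
```

<p>The comma in <code>print $1, $7</code> inserts the output field separator, a single space by default, between the two values.</p>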
<figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br></pre></td><td class="code"><pre><span class="line">wsx@wsx-laptop:~/tmp$ echo "My name is Shixiang" | gawk '{$4="Christine"; print $0}'</span><br><span class="line">My name is Christine</span><br></pre></td></tr></table></figure>
<p>You can also use the secondary prompt to enter the program script one line at a time (as with sed).</p>
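<p>Inside a script file there is no secondary prompt, but the same multi-line form still works because the single-quoted program may span several lines. A minimal sketch, using <code>awk</code> so it runs without gawk installed (gawk accepts the same program):</p>

```shell
# A newline separates statements just as a semicolon does.
echo "My name is Shixiang" | awk '{
    $4 = "Christine"   # assigning to a field rebuilds $0
    print $0
}'
# prints: My name is Christine
```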
<h4 id="从文件中读取程序"><a href="#从文件中读取程序" class="headerlink" title="从文件中读取程序"></a>Reading the program from a file</h4><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br></pre></td><td class="code"><pre><span class="line">wsx@wsx-laptop:~/tmp$ cat script2.gawk</span><br><span class="line">{print $1 " 's home directory is " $6}</span><br><span class="line">wsx@wsx-laptop:~/tmp$ gawk -F: -f script2.gawk /etc/passwd</span><br><span class="line">root 's home directory is /root</span><br><span class="line">daemon 's home directory is /usr/sbin</span><br><span class="line">bin 's home directory is /bin</span><br><span class="line">sys 's home directory is /dev</span><br><span class="line">sync 's home directory is /bin</span><br><span class="line">games 's home directory is /usr/games</span><br><span class="line">man 's home directory is /var/cache/man</span><br><span class="line">lp 's home directory is /var/spool/lpd</span><br><span class="line">mail 's home directory is /var/mail</span><br><span class="line">news 's home directory is /var/spool/news</span><br><span class="line">uucp 's home directory is /var/spool/uucp</span><br><span class="line">proxy 's home directory is /bin</span><br><span class="line">...</span><br></pre></td></tr></table></figure>
<p>The program file can contain multiple commands:</p>
<figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br></pre></td><td class="code"><pre><span class="line">wsx@wsx-laptop:~/tmp$ cat script3.gawk</span><br><span class="line">{</span><br><span class="line">text = "'s home directory is "</span><br><span class="line">print $1 text $6</span><br><span class="line">}</span><br><span class="line">wsx@wsx-laptop:~/tmp$ gawk -F: -f script3.gawk /etc/passwd</span><br><span class="line">root's home directory is /root</span><br><span class="line">daemon's home directory is /usr/sbin</span><br><span class="line">bin's home directory is /bin</span><br><span class="line">sys's home directory is /dev</span><br><span class="line">sync's home directory is /bin</span><br><span class="line">games's home directory is /usr/games</span><br><span class="line">man's home directory is /var/cache/man</span><br><span class="line">lp's home directory is /var/spool/lpd</span><br><span class="line">mail's home directory is /var/mail</span><br><span class="line">news's home directory is /var/spool/news</span><br><span class="line">...</span><br></pre></td></tr></table></figure>
<h4 id="在处理数据前运行脚本"><a href="#在处理数据前运行脚本" class="headerlink" title="在处理数据前运行脚本"></a>Running scripts before processing data</h4><p>The BEGIN keyword forces gawk to execute the program script that follows it before reading any data.</p>
<figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br></pre></td><td class="code"><pre><span class="line">wsx@wsx-laptop:~/tmp$ cat data3.txt</span><br><span class="line">Line 1</span><br><span class="line">Line 2</span><br><span class="line">Line 3</span><br><span class="line">wsx@wsx-laptop:~/tmp$ gawk 'BEGIN {print "The data3 File Contents:"}</span><br><span class="line"><span class="meta">></span><span class="bash"> {<span class="built_in">print</span> <span class="variable">$0</span>}<span class="string">' data3.txt</span></span></span><br><span class="line">The data3 File Contents:</span><br><span class="line">Line 1</span><br><span class="line">Line 2</span><br><span class="line">Line 3</span><br></pre></td></tr></table></figure>
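<p>After gawk has executed the BEGIN script, it uses the second script to process the file data.</p>
<h4 id="在处理数据后允许脚本"><a href="#在处理数据后允许脚本" class="headerlink" title="在处理数据后允许脚本"></a>Running scripts after processing data</h4><p>Like BEGIN, the END keyword lets us specify a script that gawk executes after it has read all the data.</p>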
<figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br></pre></td><td class="code"><pre><span class="line">wsx@wsx-laptop:~/tmp$ gawk 'BEGIN {print "The data3 File Contents:"}</span><br><span class="line"><span class="meta">></span><span class="bash"> {<span class="built_in">print</span> <span class="variable">$0</span>}</span></span><br><span class="line"><span class="meta">></span><span class="bash"> END {<span class="built_in">print</span> <span class="string">"End of File"</span>}<span class="string">' data3.txt</span></span></span><br><span class="line">The data3 File Contents:</span><br><span class="line">Line 1</span><br><span class="line">Line 2</span><br><span class="line">Line 3</span><br><span class="line">End of File</span><br></pre></td></tr></table></figure>
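<p>The END block is a natural place for summaries. As a small extension of the example above, the sketch below reports how many lines were read by using the built-in variable <code>NR</code>, which has not been introduced in this text: it is standard awk and holds the number of records read so far. The input is piped in rather than read from <code>data3.txt</code>, and <code>awk</code> is used so the example runs without gawk installed.</p>

```shell
# BEGIN runs once before any input, the middle block once per line,
# and END once after the last line, when NR equals the total line count.
printf 'Line 1\nLine 2\nLine 3\n' | awk '
BEGIN { print "The data3 File Contents:" }
      { print $0 }
END   { print "End of File: " NR " lines read" }'
```

<p>The last line of output is <code>End of File: 3 lines read</code>.</p>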
<p>Putting all of this together gives a neat little script that builds a complete report from a simple data file.</p>
<figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br><span class="line">39</span><br><span class="line">40</span><br><span class="line">41</span><br><span class="line">42</span><br><span class="line">43</span><br><span class="line">44</span><br><span class="line">45</span><br><span class="line">46</span><br><span class="line">47</span><br><span class="line">48</span><br><span class="line">49</span><br><span class="line">50</span><br><span class="line">51</span><br></pre></td><td class="code"><pre><span class="line">wsx@wsx-laptop:~/tmp$ cat script4.gawk</span><br><span class="line">BEGIN {</span><br><span class="line">print "The latest list of users and shells"</span><br><span class="line">print " UserID \t Shell"</span><br><span class="line">print "-------- \t ------"</span><br><span class="line">FS=":"</span><br><span class="line">}</span><br><span class="line"></span><br><span class="line">{</span><br><span class="line">print $1 " \t " $7</span><br><span class="line">}</span><br><span class="line"></span><br><span class="line">END {</span><br><span class="line">print "This concludes the listing"</span><br><span class="line">}</span><br><span class="line">wsx@wsx-laptop:~/tmp$ gawk -f script4.gawk /etc/passwd</span><br><span class="line">The latest list of users and shells</span><br><span class="line"> UserID Shell</span><br><span class="line">-------- ------</span><br><span class="line">root /bin/bash</span><br><span class="line">daemon /usr/sbin/nologin</span><br><span class="line">bin /usr/sbin/nologin</span><br><span class="line">sys /usr/sbin/nologin</span><br><span class="line">sync /bin/sync</span><br><span class="line">games /usr/sbin/nologin</span><br><span class="line">man /usr/sbin/nologin</span><br><span class="line">lp /usr/sbin/nologin</span><br><span class="line">mail /usr/sbin/nologin</span><br><span class="line">news /usr/sbin/nologin</span><br><span class="line">uucp /usr/sbin/nologin</span><br><span class="line">proxy /usr/sbin/nologin</span><br><span class="line">www-data /usr/sbin/nologin</span><br><span class="line">backup /usr/sbin/nologin</span><br><span class="line">list /usr/sbin/nologin</span><br><span class="line">irc /usr/sbin/nologin</span><br><span class="line">gnats /usr/sbin/nologin</span><br><span class="line">nobody /usr/sbin/nologin</span><br><span class="line">systemd-timesync /bin/false</span><br><span class="line">systemd-network /bin/false</span><br><span class="line">systemd-resolve /bin/false</span><br><span class="line">systemd-bus-proxy /bin/false</span><br><span class="line">syslog /bin/false</span><br><span class="line">_apt /bin/false</span><br><span class="line">lxd /bin/false</span><br><span class="line">messagebus /bin/false</span><br><span class="line">uuidd /bin/false</span><br><span class="line">dnsmasq /bin/false</span><br><span class="line">sshd /usr/sbin/nologin</span><br><span class="line">pollinate /bin/false</span><br><span class="line">wsx /bin/bash</span><br><span class="line">This concludes the listing</span><br></pre></td></tr></table></figure>
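<p>The same BEGIN / main / END layout can also filter and summarize. The sketch below is only an illustration: it uses a small hypothetical sample in <code>/etc/passwd</code> format instead of the real file so that the result is reproducible, and it assumes a POSIX <code>awk</code>.</p>

```shell
# Hypothetical sample in /etc/passwd format (not the real file).
sample='root:x:0:0:root:/root:/bin/bash
daemon:x:1:1:daemon:/usr/sbin:/usr/sbin/nologin
wsx:x:1000:1000::/home/wsx:/bin/bash'

out=$(printf '%s\n' "$sample" | awk '
BEGIN { FS = ":" }                    # split fields on colons
$7 == "/bin/bash" { n++; print $1 }   # print users whose shell is bash
END { print n " bash users" }         # summary after the last line
')
printf '%s\n' "$out"
```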
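<p>We will continue with advanced gawk programming later.</p>
<h2 id="sed编辑器基础"><a href="#sed编辑器基础" class="headerlink" title="sed编辑器基础"></a>sed editor basics</h2><p>This section covers some basic commands and features that can be built into scripts.</p>
<h3 id="更多的替换选项"><a href="#更多的替换选项" class="headerlink" title="更多的替换选项"></a>More substitution options</h3><p>We have already used the <code>s</code> command to substitute text within a line; the command has several other options as well.</p>
<h4 id="替换标记"><a href="#替换标记" class="headerlink" title="替换标记"></a>Substitution flags</h4><p>By default, the substitute command <code>s</code> replaces only the first occurrence in each line. To replace occurrences elsewhere in the line, use a <strong>substitution flag</strong>, which is placed after the substitute command string.</p>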
<figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">s/pattern/replacement/flags</span><br></pre></td></tr></table></figure>
<p>There are four substitution flags:</p>
<ul>
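<li>A number: replace only that occurrence of the pattern</li>
<li>g: replace every occurrence of the pattern</li>
<li>p: print the line when a substitution was made on it</li>
<li>w file: write the result of the substitution to a file</li>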
</ul>
<figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br></pre></td><td class="code"><pre><span class="line">wsx@wsx-laptop:~/tmp$ cat data4.txt</span><br><span class="line">This is a test of the test script.</span><br><span class="line">This is the second test of the test script.</span><br><span class="line">wsx@wsx-laptop:~/tmp$ sed 's/test/trial/2' data4.txt</span><br><span class="line">This is a test of the trial script.</span><br><span class="line">This is the second test of the trial script.</span><br></pre></td></tr></table></figure>
<p>This command replaces only the second occurrence of the pattern in each line. The <code>g</code> flag, by contrast, replaces every occurrence.</p>
<figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br></pre></td><td class="code"><pre><span class="line">wsx@wsx-laptop:~/tmp$ sed 's/test/trial/g' data4.txt</span><br><span class="line">This is a trial of the trial script.</span><br><span class="line">This is the second trial of the trial script.</span><br></pre></td></tr></table></figure>
<p>The <code>p</code> flag prints the lines on which the substitute command matched its pattern; it is usually combined with sed's <code>-n</code> option.</p>
<figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br></pre></td><td class="code"><pre><span class="line">wsx@wsx-laptop:~/tmp$ cat data5.txt</span><br><span class="line">This is a test line.</span><br><span class="line">This is a different line.</span><br><span class="line">wsx@wsx-laptop:~/tmp$ sed -n 's/test/trial/p' data5.txt</span><br><span class="line">This is a trial line.</span><br></pre></td></tr></table></figure>
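<p>The <code>-n</code> option suppresses sed's normal output, while the <code>p</code> flag outputs the modified lines. Used together, they <strong>output only the lines changed by the substitute command</strong>.</p>
<p>The <code>w</code> flag selects the same lines (only those changed by the substitute command) but saves them to the specified file, while sed's normal output is still printed.</p>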
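<p>A minimal sketch of the <code>-n</code> plus <code>p</code> combination, using inline sample data that mirrors data5.txt so it can be run without creating any files:</p>

```shell
# Sample data mirroring data5.txt from the text.
data='This is a test line.
This is a different line.'

# -n silences the default output; the p flag prints only substituted lines.
changed=$(printf '%s\n' "$data" | sed -n 's/test/trial/p')
printf '%s\n' "$changed"
```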
<figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br></pre></td><td class="code"><pre><span class="line">wsx@wsx-laptop:~/tmp$ sed 's/test/trial/w test.txt' data5.txt</span><br><span class="line">This is a trial line.</span><br><span class="line">This is a different line.</span><br><span class="line">wsx@wsx-laptop:~/tmp$ cat test.txt</span><br><span class="line">This is a trial line.</span><br></pre></td></tr></table></figure>
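<h4 id="替换字符"><a href="#替换字符" class="headerlink" title="替换字符"></a>Replacing characters</h4><p>Some characters are awkward to use in a substitution pattern; the forward slash is a common example.</p>
<p>Substituting pathnames in a file can be tedious. For example, to replace the bash shell with the C shell in /etc/passwd, every slash has to be escaped with a backslash:</p>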
<figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br></pre></td><td class="code"><pre><span class="line">wsx@wsx-laptop:~/tmp$ head /etc/passwd</span><br><span class="line">root:x:0:0:root:/root:/bin/bash</span><br><span class="line">daemon:x:1:1:daemon:/usr/sbin:/usr/sbin/nologin</span><br><span class="line">...</span><br><span class="line">wsx@wsx-laptop:~/tmp$ sed 's/\/bin\/bash/\/bin\/csh/' /etc/passwd</span><br><span class="line">root:x:0:0:root:/root:/bin/csh</span><br><span class="line">daemon:x:1:1:daemon:/usr/sbin:/usr/sbin/nologin</span><br><span class="line">bin:x:2:2:bin:/bin:/usr/sbin/nologin</span><br><span class="line">...</span><br></pre></td></tr></table></figure>
<p>To get around this, the sed editor lets you choose a different character as the string delimiter in the substitute command:</p>
<figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br></pre></td><td class="code"><pre><span class="line">wsx@wsx-laptop:~/tmp$ sed 's!/bin/bash!/bin/csh!' /etc/passwd</span><br><span class="line">root:x:0:0:root:/root:/bin/csh</span><br><span class="line">daemon:x:1:1:daemon:/usr/sbin:/usr/sbin/nologin</span><br><span class="line">...</span><br></pre></td></tr></table></figure>
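<p>The delimiter does not have to be <code>!</code>; whatever character follows <code>s</code> is used. A small sketch with <code>#</code> on a single sample record (the record is illustrative, not read from the real /etc/passwd):</p>

```shell
# Using "#" as the delimiter avoids escaping the slashes in the paths.
out=$(echo 'wsx:x:1000:1000::/home/wsx:/bin/bash' | sed 's#/bin/bash#/bin/csh#')
echo "$out"
```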
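<h3 id="使用地址"><a href="#使用地址" class="headerlink" title="使用地址"></a>Using addresses</h3><p>If you want a command to apply only to a specific line or group of lines, you must use <strong>line addressing</strong>.</p>
<p>There are two forms:</p>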
<ul>
<li>A numeric range of lines</li>
<li>A text pattern that filters out the lines</li>
</ul>
<p>Both use the same format to specify the address:</p>
<figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">[address]command</span><br></pre></td></tr></table></figure>
<p>You can also group several commands under one address:</p>
<figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br></pre></td><td class="code"><pre><span class="line">address {</span><br><span class="line"> command1</span><br><span class="line"> command2</span><br><span class="line"> command3</span><br><span class="line">}</span><br></pre></td></tr></table></figure>
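<h4 id="以数字的方式行寻址"><a href="#以数字的方式行寻址" class="headerlink" title="以数字的方式行寻址"></a>Addressing lines by number</h4><p>The sed editor numbers the first line of the text stream as 1 and continues numbering the following lines sequentially.</p>
<p>The address <strong>can be a single line number, or a range given as a starting line number, a comma, and an ending line number</strong>.</p>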
<figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br><span class="line">39</span><br><span class="line">40</span><br></pre></td><td class="code"><pre><span class="line">wsx@wsx-laptop:~/tmp$ cat data1.txt</span><br><span class="line">The quick brown fox jumps over the lazy dog.</span><br><span class="line">The quick brown fox jumps over the lazy dog.</span><br><span class="line">The quick brown fox jumps over the lazy dog.</span><br><span class="line">The quick brown fox jumps over the lazy dog.</span><br><span class="line">The quick brown fox jumps over the lazy dog.</span><br><span class="line">The quick brown fox jumps over the lazy dog.</span><br><span class="line">The quick brown fox jumps over the lazy dog.</span><br><span class="line">The quick brown fox jumps over the lazy dog.</span><br><span class="line">The quick brown fox jumps over the lazy dog.</span><br><span class="line">wsx@wsx-laptop:~/tmp$ sed '2s/dog/cat/' data1.txt</span><br><span class="line">The quick brown fox jumps over the lazy dog.</span><br><span class="line">The quick brown fox jumps over the lazy cat.</span><br><span class="line">The quick brown fox jumps over the lazy dog.</span><br><span class="line">The quick brown fox jumps over the lazy dog.</span><br><span class="line">The quick brown fox jumps over the lazy dog.</span><br><span class="line">The quick brown fox jumps over the lazy dog.</span><br><span class="line">The quick brown fox jumps over the lazy dog.</span><br><span class="line">The quick brown fox jumps over the lazy dog.</span><br><span class="line">The quick brown fox jumps over the lazy dog.</span><br><span class="line">wsx@wsx-laptop:~/tmp$ sed '2,3s/dog/cat/' data1.txt</span><br><span class="line">The quick brown fox jumps over the lazy dog.</span><br><span class="line">The quick brown fox jumps over the lazy cat.</span><br><span class="line">The quick brown fox jumps over the lazy cat.</span><br><span class="line">The quick brown fox jumps over the lazy dog.</span><br><span class="line">The quick brown fox jumps over the lazy dog.</span><br><span class="line">The quick brown fox jumps over the lazy dog.</span><br><span class="line">The quick brown fox jumps over the lazy dog.</span><br><span class="line">The quick brown fox jumps over the lazy dog.</span><br><span class="line">The quick brown fox jumps over the lazy dog.</span><br><span class="line">wsx@wsx-laptop:~/tmp$ sed '2,$s/dog/cat/' data1.txt # the dollar sign refers to the last line</span><br><span class="line">The quick brown fox jumps over the lazy dog.</span><br><span class="line">The quick brown fox jumps over the lazy cat.</span><br><span class="line">The quick brown fox jumps over the lazy cat.</span><br><span class="line">The quick brown fox jumps over the lazy cat.</span><br><span class="line">The quick brown fox jumps over the lazy cat.</span><br><span class="line">The quick brown fox jumps over the lazy cat.</span><br><span class="line">The quick brown fox jumps over the lazy cat.</span><br><span class="line">The quick brown fox jumps over the lazy cat.</span><br><span class="line">The quick brown fox jumps over the lazy cat.</span><br></pre></td></tr></table></figure>
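<p>The dollar sign can also be used on its own as a single address, in which case the command applies to the last line only. A minimal sketch with three illustrative input lines:</p>

```shell
# "$" alone addresses only the last line of the stream.
out=$(printf 'lazy dog\nlazy dog\nlazy dog\n' | sed '$s/dog/cat/')
printf '%s\n' "$out"
```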
<h4 id="使用文本模式过滤器"><a href="#使用文本模式过滤器" class="headerlink" title="使用文本模式过滤器"></a>Using text pattern filters</h4><p>sed lets you specify a text pattern to filter the lines a command applies to, in the following format:</p>
<figure class="highlight livecodeserver"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">/pattern/<span class="keyword">command</span></span><br></pre></td></tr></table></figure>
<p>For example, to change the default shell of a single user only, use the sed command:</p>
<figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br></pre></td><td class="code"><pre><span class="line">wsx@wsx-laptop:~/tmp$ grep wsx /etc/passwd</span><br><span class="line">wsx:x:1000:1000:"",,,:/home/wsx:/bin/bash</span><br><span class="line">wsx@wsx-laptop:~/tmp$ sed '/wsx/s/bash/csh/' /etc/passwd</span><br><span class="line">root:x:0:0:root:/root:/bin/bash</span><br><span class="line">daemon:x:1:1:daemon:/usr/sbin:/usr/sbin/nologin</span><br><span class="line">bin:x:2:2:bin:/bin:/usr/sbin/nologin</span><br><span class="line">...</span><br><span class="line">wsx:x:1000:1000:"",,,:/home/wsx:/bin/csh</span><br></pre></td></tr></table></figure>
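<p>A pattern such as <code>/wsx/</code> matches any line that contains that text, so another account whose record includes "wsx" would be changed too. One way to narrow the match, sketched here on hypothetical sample records (the <code>wsxadmin</code> user is invented for illustration), is to anchor the pattern to the start of the line and include the field separator:</p>

```shell
# Hypothetical sample records; "wsxadmin" is an invented second user.
sample='root:x:0:0:root:/root:/bin/bash
wsxadmin:x:1001:1001::/home/wsxadmin:/bin/bash
wsx:x:1000:1000::/home/wsx:/bin/bash'

# ^wsx: matches only the record whose user name is exactly "wsx".
out=$(printf '%s\n' "$sample" | sed '/^wsx:/s/bash/csh/')
printf '%s\n' "$out"
```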
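<p>Regular expressions let you build advanced pattern-matching expressions for all kinds of data, combining wildcards and special characters into concise patterns for almost any text. We will cover them later.</p>
<h4 id="命令组合"><a href="#命令组合" class="headerlink" title="命令组合"></a>Grouping commands</h4><p>Use braces to group several commands together.</p>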
<figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br></pre></td><td class="code"><pre><span class="line">wsx@wsx-laptop:~/tmp$ sed '2{</span><br><span class="line"><span class="meta">></span><span class="bash"> s/fox/elephant/</span></span><br><span class="line"><span class="meta">></span><span class="bash"> s/dog/cat/</span></span><br><span class="line"><span class="meta">></span><span class="bash"> }<span class="string">' data1.txt</span></span></span><br><span class="line">The quick brown fox jumps over the lazy dog.</span><br><span class="line">The quick brown elephant jumps over the lazy cat.</span><br><span class="line">The quick brown fox jumps over the lazy dog.</span><br><span class="line">The quick brown fox jumps over the lazy dog.</span><br><span class="line">The quick brown fox jumps over the lazy dog.</span><br><span class="line">The quick brown fox jumps over the lazy dog.</span><br><span class="line">The quick brown fox jumps over the lazy dog.</span><br><span class="line">The quick brown fox jumps over the lazy dog.</span><br><span class="line">The quick brown fox jumps over the lazy dog.</span><br></pre></td></tr></table></figure>
<p>You can also specify an address range before a group of commands.</p>
<figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br></pre></td><td class="code"><pre><span class="line">wsx@wsx-laptop:~/tmp$ sed '3,${</span><br><span class="line">s/brown/green/</span><br><span class="line">s/lazy/active/</span><br><span class="line">}' data1.txt</span><br><span class="line">The quick brown fox jumps over the lazy dog.</span><br><span class="line">The quick brown fox jumps over the lazy dog.</span><br><span class="line">The quick green fox jumps over the active dog.</span><br><span class="line">The quick green fox jumps over the active dog.</span><br><span class="line">The quick green fox jumps over the active dog.</span><br><span class="line">The quick green fox jumps over the active dog.</span><br><span class="line">The quick green fox jumps over the active dog.</span><br><span class="line">The quick green fox jumps over the active dog.</span><br><span class="line">The quick green fox jumps over the active dog.</span><br></pre></td></tr></table></figure>
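<p>Grouping works with pattern addresses as well as line numbers. The following sketch, on two illustrative lines, applies both substitutions only to the line that contains "dog":</p>

```shell
# Both s commands run only on lines matching /dog/.
out=$(printf 'The brown fox\nThe brown dog\n' | sed '/dog/{
s/brown/green/
s/dog/cat/
}')
printf '%s\n' "$out"
```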
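<h3 id="删除行"><a href="#删除行" class="headerlink" title="删除行"></a>Deleting lines</h3><p>To delete specific lines from the text stream, use the delete command <code>d</code>, which deletes every line matching the given address. <strong>Use it with care</strong>: if you forget the address, every line is deleted from the stream.</p>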
<figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br></pre></td><td class="code"><pre><span class="line">wsx@wsx-laptop:~/tmp$ cat data1.txt</span><br><span class="line">The quick brown fox jumps over the lazy dog.</span><br><span class="line">The quick brown fox jumps over the lazy dog.</span><br><span class="line">The quick brown fox jumps over the lazy dog.</span><br><span class="line">The quick brown fox jumps over the lazy dog.</span><br><span class="line">The quick brown fox jumps over the lazy dog.</span><br><span class="line">The quick brown fox jumps over the lazy dog.</span><br><span class="line">The quick brown fox jumps over the lazy dog.</span><br><span class="line">The quick brown fox jumps over the lazy dog.</span><br><span class="line">The quick brown fox jumps over the lazy dog.</span><br><span class="line">wsx@wsx-laptop:~/tmp$ sed 'd' data1.txt</span><br></pre></td></tr></table></figure>
<p>The delete command is most useful when combined with an address.</p>
<figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br></pre></td><td class="code"><pre><span class="line">wsx@wsx-laptop:~/tmp$ cat data6.txt</span><br><span class="line">This is line number 1.</span><br><span class="line">This is line number 2.</span><br><span class="line">This is line number 3.</span><br><span class="line">This is line number 4.</span><br><span class="line">wsx@wsx-laptop:~/tmp$ sed '3d' data6.txt</span><br><span class="line">This is line number 1.</span><br><span class="line">This is line number 2.</span><br><span class="line">This is line number 4.</span><br></pre></td></tr></table></figure>
<p>By a range of lines:</p>
<figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br></pre></td><td class="code"><pre><span class="line">wsx@wsx-laptop:~/tmp$ sed '2,3d' data6.txt</span><br><span class="line">This is line number 1.</span><br><span class="line">This is line number 4.</span><br></pre></td></tr></table></figure>
<p>Using the special last-line character <code>$</code>:</p>
<figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br></pre></td><td class="code"><pre><span class="line">wsx@wsx-laptop:~/tmp$ sed '2,$d' data6.txt</span><br><span class="line">This is line number 1.</span><br></pre></td></tr></table></figure>
<p>Pattern matching works as well:</p>
<figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br></pre></td><td class="code"><pre><span class="line">wsx@wsx-laptop:~/tmp$ sed '/number 1/d' data6.txt</span><br><span class="line">This is line number 2.</span><br><span class="line">This is line number 3.</span><br><span class="line">This is line number 4.</span><br></pre></td></tr></table></figure>
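<p>sed deletes the lines that contain text matching the pattern.</p>
<p>Remember that sed does not modify the original file; the deleted lines are missing only from sed's output.</p>
<p>You can also delete a range of lines using two text patterns, but be careful: the first pattern "turns on" line deletion and the second "turns it off". sed deletes all lines between the two matching lines, including the matching lines themselves.</p>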
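<p>This can be checked directly. The sketch below writes three illustrative lines to a temporary file (created with <code>mktemp</code>, which is assumed to be available), deletes line 2 in sed's output, and then reads the file back to confirm it still has all three lines:</p>

```shell
# Create a scratch file with three lines.
tmp=$(mktemp)
printf 'line 1\nline 2\nline 3\n' > "$tmp"

# sed writes the edited stream to stdout; the file itself is untouched.
out=$(sed '2d' "$tmp")
after=$(cat "$tmp")
rm -f "$tmp"

printf 'sed output:\n%s\n' "$out"
printf 'file afterwards:\n%s\n' "$after"
```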
<figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br></pre></td><td class="code"><pre><span class="line">wsx@wsx-laptop:~/tmp$ cat data7.txt</span><br><span class="line">This is line number 1.</span><br><span class="line">This is line number 2.</span><br><span class="line">This is line number 3.</span><br><span class="line">This is line number 4.</span><br><span class="line">This is line number 1 again.</span><br><span class="line">This is text you want to keep.</span><br><span class="line">This is the last line in the file.</span><br><span class="line">wsx@wsx-laptop:~/tmp$ sed '/1/,/3/d' data7.txt</span><br><span class="line">This is line number 4.</span><br></pre></td></tr></table></figure>
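<p>The second line containing "1" triggered the delete command again, and because the stop pattern "3" was never found after it, the rest of the data stream was deleted.</p>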
<figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br></pre></td><td class="code"><pre><span class="line">wsx@wsx-laptop:~/tmp$ sed '/1/,/5/d' data7.txt</span><br><span class="line">wsx@wsx-laptop:~/tmp$ sed '/2/,/4/d' data7.txt</span><br><span class="line">This is line number 1.</span><br><span class="line">This is line number 1 again.</span><br><span class="line">This is text you want to keep.</span><br><span class="line">This is the last line in the file.</span><br></pre></td></tr></table></figure>
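<p>The "turn on / turn off" behaviour is easier to see on a tiny input. In this sketch the six input lines are illustrative: the range opens at <code>a1</code>, closes at <code>c3</code>, reopens at <code>e1</code>, and never closes again, so only <code>d</code> survives.</p>

```shell
# /1/ opens the range, /3/ closes it; a later /1/ reopens it until end of input.
out=$(printf 'a1\nb\nc3\nd\ne1\nf\n' | sed '/1/,/3/d')
printf '%s\n' "$out"
```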
<h3 id="插入和附加文本"><a href="#插入和附加文本" class="headerlink" title="插入和附加文本"></a>Inserting and appending text</h3><p>sed lets you insert and append lines of text to the data stream:</p>
<ul>
<li>The insert command <code>i</code> adds a new line before the specified line</li>
<li>The append command <code>a</code> adds a new line after the specified line</li>
</ul>
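<p>Note that in their standard form these commands span more than one line: the command and a backslash come first, and the text to insert or append goes on the following line. GNU sed also accepts the one-line form used in the examples here.</p>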
<figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br></pre></td><td class="code"><pre><span class="line">wsx@wsx-laptop:~/tmp$ echo "Test Line 2" | sed 'i\Test Line 1'</span><br><span class="line">Test Line 1</span><br><span class="line">Test Line 2</span><br><span class="line">wsx@wsx-laptop:~/tmp$ echo "Test Line 2" | sed 'a\Test Line 1'</span><br><span class="line">Test Line 2</span><br><span class="line">Test Line 1</span><br></pre></td></tr></table></figure>
<p>To insert or append data at a position inside the data stream, use addressing to tell sed where the data should appear.</p>
<figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br></pre></td><td class="code"><pre><span class="line">wsx@wsx-laptop:~/tmp$ sed '3i\ This is an inserted line.' data6.txt</span><br><span class="line">This is line number 1.</span><br><span class="line">This is line number 2.</span><br><span class="line"> This is an inserted line.</span><br><span class="line">This is line number 3.</span><br><span class="line">This is line number 4.</span><br><span class="line">wsx@wsx-laptop:~/tmp$ sed '3a\ This is an inserted line.' data6.txt</span><br><span class="line">This is line number 1.</span><br><span class="line">This is line number 2.</span><br><span class="line">This is line number 3.</span><br><span class="line"> This is an inserted line.</span><br><span class="line">This is line number 4.</span><br></pre></td></tr></table></figure>
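<p>For comparison, here is a sketch of the multi-line form, in which the text sits on its own line after the backslash. It combines an insert before line 2 with an append at the last line; the input and the inserted text are illustrative.</p>

```shell
# Multi-line form: command, backslash, newline, then the text.
out=$(printf 'line 1\nline 2\n' | sed '2i\
inserted before line 2
$a\
appended after the last line')
printf '%s\n' "$out"
```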
<p>To add a new line at the end of the data stream, give <code>$</code> as the address. Appending to data6.txt at that address gives:</p>
<figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br></pre></td><td class="code"><pre><span class="line">This is line number 1.</span><br><span class="line">This is line number 2.</span><br><span class="line">This is line number 3.</span><br><span class="line">This is line number 4.</span><br><span class="line"> This is a new line.</span><br></pre></td></tr></table></figure>
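<h3 id="修改行"><a href="#修改行" class="headerlink" title="修改行"></a>Changing lines</h3><p>The change command (<code>c</code>) replaces the entire text of a line in the data stream. It works the same way as the insert and append commands.</p>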
<figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br></pre></td><td class="code"><pre><span class="line">wsx@wsx-laptop:~/tmp$ sed '3c\This is a changed line.' data6.txt</span><br><span class="line">This is line number 1.</span><br><span class="line">This is line number 2.</span><br><span class="line">This is a changed line.</span><br><span class="line">This is line number 4.</span><br><span class="line">wsx@wsx-laptop:~/tmp$ sed '/number 3/c\This is a changed line.' data6.txt</span><br><span class="line">This is line number 1.</span><br><span class="line">This is line number 2.</span><br><span class="line">This is a changed line.</span><br><span class="line">This is line number 4.</span><br></pre></td></tr></table></figure>
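<h3 id="转换命令"><a href="#转换命令" class="headerlink" title="转换命令"></a>The transform command</h3><p>The transform command (<code>y</code>) is <strong>the only sed command that operates on single characters</strong>. Its format is:</p>
<figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">[address]y/inchars/outchars/</span><br></pre></td></tr></table></figure>
<p>The transform command maps the characters in <code>inchars</code> one-to-one onto those in <code>outchars</code>: the first character of <code>inchars</code> becomes the first character of <code>outchars</code>, and so on. If the two have different lengths, sed reports an error.</p>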
<figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br></pre></td><td class="code"><pre><span class="line">wsx@wsx-laptop:~/tmp$ sed 'y/123/789/' data6.txt</span><br><span class="line">This is line number 7.</span><br><span class="line">This is line number 8.</span><br><span class="line">This is line number 9.</span><br><span class="line">This is line number 4.</span><br></pre></td></tr></table></figure>
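<p>The transform command is global: <strong>it converts every occurrence of the specified characters in the line, wherever they appear</strong>.</p>
<h3 id="回顾命令"><a href="#回顾命令" class="headerlink" title="回顾命令"></a>Printing commands revisited</h3><p>Three more commands print information from the data stream:</p>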
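<p>Because the mapping is per character, <code>y</code> can be used for simple case changes. A minimal sketch on an illustrative string, turning every lowercase vowel into its uppercase form:</p>

```shell
# Each character in "aeiou" maps to the character at the same position in "AEIOU".
out=$(echo 'sed is neat' | sed 'y/aeiou/AEIOU/')
echo "$out"
```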
<ul>
<li>The <code>p</code> command prints a text line</li>
<li>The equal sign <code>=</code> command prints line numbers</li>
<li>The <code>l</code> command lists a line</li>
</ul>
<h4 id="打印行"><a href="#打印行" class="headerlink" title="打印行"></a>Printing lines</h4><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br></pre></td><td class="code"><pre><span class="line">wsx@wsx-laptop:~/tmp$ echo "this is a test" | sed 'p'</span><br><span class="line">this is a test</span><br><span class="line">this is a test</span><br></pre></td></tr></table></figure>
<p><code>p</code> prints the existing text, which is why the line appears twice above: once from <code>p</code> and once from sed's normal output. Its most common use is printing the lines that match a text pattern.</p>
<figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br></pre></td><td class="code"><pre><span class="line">wsx@wsx-laptop:~/tmp$ cat data6.txt</span><br><span class="line">This is line number 1.</span><br><span class="line">This is line number 2.</span><br><span class="line">This is line number 3.</span><br><span class="line">This is line number 4.</span><br><span class="line">wsx@wsx-laptop:~/tmp$ sed -n '/number 3/p' data6.txt</span><br><span class="line">This is line number 3.</span><br></pre></td></tr></table></figure>
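<p>With the <code>-n</code> option on the command line, all other lines are suppressed and only the lines containing the matching pattern are printed.</p>
<p>It is also a quick way to print selected lines of the data stream:</p>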
<figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br></pre></td><td class="code"><pre><span class="line">wsx@wsx-laptop:~/tmp$ sed -n '2,3p' data6.txt</span><br><span class="line">This is line number 2.</span><br><span class="line">This is line number 3.</span><br></pre></td></tr></table></figure>
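<p>The <code>p</code> command can also be combined with a substitution to show a line before and after it is changed. A sketch on two illustrative lines: the first <code>p</code> prints the matching line as it was, and the <code>p</code> flag on <code>s</code> prints it again after the change.</p>

```shell
# Print the matching line, modify it, then print the modified version.
out=$(printf 'This is line number 2.\nThis is line number 3.\n' | sed -n '/number 3/{
p
s/line/test/p
}')
printf '%s\n' "$out"
```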
<h4 id="打印行号"><a href="#打印行号" class="headerlink" title="打印行号"></a>Printing line numbers</h4><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br></pre></td><td class="code"><pre><span class="line">wsx@wsx-laptop:~/tmp$ cat data1.txt</span><br><span class="line">The quick brown fox jumps over the lazy dog.</span><br><span class="line">The quick brown fox jumps over the lazy dog.</span><br><span class="line">The quick brown fox jumps over the lazy dog.</span><br><span class="line">The quick brown fox jumps over the lazy dog.</span><br><span class="line">The quick brown fox jumps over the lazy dog.</span><br><span class="line">The quick brown fox jumps over the lazy dog.</span><br><span class="line">The quick brown fox jumps over the lazy dog.</span><br><span class="line">The quick brown fox jumps over the lazy dog.</span><br><span class="line">The quick brown fox jumps over the lazy dog.</span><br><span class="line">wsx@wsx-laptop:~/tmp$ sed '=' data1.txt</span><br><span class="line">1</span><br><span class="line">The quick brown fox jumps over the lazy dog.</span><br><span class="line">2</span><br><span class="line">The quick brown fox jumps over the lazy dog.</span><br><span class="line">3</span><br><span class="line">The quick brown fox jumps over the lazy dog.</span><br><span class="line">4</span><br><span class="line">The quick brown fox jumps over the lazy dog.</span><br><span class="line">5</span><br><span class="line">The quick brown fox jumps over the lazy dog.</span><br><span class="line">6</span><br><span class="line">The quick brown fox jumps over the lazy dog.</span><br><span class="line">7</span><br><span class="line">The quick brown fox jumps over the lazy dog.</span><br><span class="line">8</span><br><span class="line">The quick brown fox jumps over the lazy dog.</span><br><span class="line">9</span><br><span class="line">The quick brown fox jumps over the lazy dog.</span><br></pre></td></tr></table></figure>
<p>This is handy for finding the line number of a specific text pattern:</p>
<figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br></pre></td><td class="code"><pre><span class="line">wsx@wsx-laptop:~/tmp$ sed -n '/number 4/{</span><br><span class="line"><span class="meta">></span><span class="bash"> =</span></span><br><span class="line"><span class="meta">></span><span class="bash"> p</span></span><br><span class="line"><span class="meta">></span><span class="bash"> }<span class="string">' data6.txt</span></span></span><br><span class="line">4</span><br><span class="line">This is line number 4.</span><br></pre></td></tr></table></figure>
<h4 id="列出行"><a href="#列出行" class="headerlink" title="列出行"></a>Listing lines</h4><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br></pre></td><td class="code"><pre><span class="line">wsx@wsx-laptop:~/tmp$ cat data9.txt</span><br><span class="line">This line contains tabs.</span><br><span class="line">wsx@wsx-laptop:~/tmp$ sed -n 'l' data9.txt</span><br><span class="line">This\tline\tcontains\ttabs.$</span><br></pre></td></tr></table></figure>
<p>The <code>l</code> command shows each tab as <code>\t</code> and marks the end of the line with <code>$</code>.</p>
<h3 id="使用Sed处理文件"><a href="#使用Sed处理文件" class="headerlink" title="使用Sed处理文件"></a>Using files with sed</h3><h4 id="写入文件"><a href="#写入文件" class="headerlink" title="写入文件"></a>Writing to a file</h4><p>The <code>w</code> command writes lines to a file. Its format is:</p>
<figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">[address]w filename</span><br></pre></td></tr></table></figure>
<p>To write the first two lines of the text to another file:</p>
<figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br></pre></td><td class="code"><pre><span class="line">wsx@wsx-laptop:~/tmp$ sed '1,2w test.txt' data6.txt</span><br><span class="line">This is line number 1.</span><br><span class="line">This is line number 2.</span><br><span class="line">This is line number 3.</span><br><span class="line">This is line number 4.</span><br><span class="line">wsx@wsx-laptop:~/tmp$ cat test.txt</span><br><span class="line">This is line number 1.</span><br><span class="line">This is line number 2.</span><br></pre></td></tr></table></figure>
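<p>sed still prints every line of the data stream to STDOUT by default. To keep the lines from being displayed, add the <code>-n</code> option to the sed command.</p>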
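<p>A small sketch of that combination. It runs in a scratch directory, and the file names <code>sample.txt</code> and <code>matched.txt</code> are hypothetical:</p>

```shell
# Work in a throwaway directory so no existing files are touched.
workdir=$(mktemp -d)
printf 'This is line number %s.\n' 1 2 3 4 > "$workdir/sample.txt"

# -n suppresses the default output; w writes only the addressed lines (2 and 3).
shown=$(sed -n "2,3w $workdir/matched.txt" "$workdir/sample.txt")
written=$(cat "$workdir/matched.txt")

printf 'stdout: [%s]\n' "$shown"
printf '%s\n' "$written"
```

<p>Nothing reaches STDOUT, while the two addressed lines end up in the output file.</p>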
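<h4 id="读取数据"><a href="#读取数据" class="headerlink" title="读取数据"></a>Reading data from a file</h4><p>The read command is <code>r</code>. It inserts the contents of a separate file into the data stream after the addressed line.</p>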
<figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br></pre></td><td class="code"><pre><span class="line">wsx@wsx-laptop:~/tmp$ cat data12.txt</span><br><span class="line">This is an added line.</span><br><span class="line">This is the second added line.</span><br><span class="line">wsx@wsx-laptop:~/tmp$ sed '3r data12.txt' data6.txt</span><br><span class="line">This is line number 1.</span><br><span class="line">This is line number 2.</span><br><span class="line">This is line number 3.</span><br><span class="line">This is an added line.</span><br><span class="line">This is the second added line.</span><br><span class="line">This is line number 4.</span><br></pre></td></tr></table></figure>
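<p>The effect resembles the insert (<code>i</code>) and append (<code>a</code>) commands, except that the text comes from a file and is always added after the addressed line.</p>
<p>Text pattern addresses work as well:</p>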
<figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br></pre></td><td class="code"><pre><span class="line">wsx@wsx-laptop:~/tmp$ sed '/number 2/r data12.txt' data6.txt</span><br><span class="line">This is line number 1.</span><br><span class="line">This is line number 2.</span><br><span class="line">This is an added line.</span><br><span class="line">This is the second added line.</span><br><span class="line">This is line number 3.</span><br><span class="line">This is line number 4.</span><br></pre></td></tr></table></figure>
<p>To add the text at the end of the data stream, use the <code>$</code> address:</p>
<figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br></pre></td><td class="code"><pre><span class="line">wsx@wsx-laptop:~/tmp$ sed '$r data12.txt' data6.txt</span><br><span class="line">This is line number 1.</span><br><span class="line">This is line number 2.</span><br><span class="line">This is line number 3.</span><br><span class="line">This is line number 4.</span><br><span class="line">This is an added line.</span><br><span class="line">This is the second added line.</span><br></pre></td></tr></table></figure>
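<p><strong>A neat use of the read command is to combine it with the delete command, replacing placeholder text in a file with data from another file</strong>. Suppose a form letter is stored in a text file:</p>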
<figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br></pre></td><td class="code"><pre><span class="line">wsx@wsx-laptop:~/tmp$ cat notice.std</span><br><span class="line">Would the following people:</span><br><span class="line">LIST</span><br><span class="line">please report to the ship's captain.</span><br></pre></td></tr></table></figure>
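<p>The form letter puts the generic placeholder <code>LIST</code> where the list of people belongs. We first read the data file in after the placeholder line, then delete the placeholder line itself.</p>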
<figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br></pre></td><td class="code"><pre><span class="line">wsx@wsx-laptop:~/tmp$ sed '/LIST/{</span><br><span class="line"><span class="meta">></span><span class="bash"> r data10.txt</span></span><br><span class="line"><span class="meta">></span><span class="bash"> d</span></span><br><span class="line"><span class="meta">></span><span class="bash"> }<span class="string">' notice.std</span></span></span><br><span class="line">Would the following people:</span><br><span class="line">This line contains an escape character.</span><br><span class="line">please report to the ship's captain.</span><br><span class="line">wsx@wsx-laptop:~/tmp$ cat data10.txt</span><br><span class="line">This line contains an escape character.</span><br><span class="line">wsx@wsx-laptop:~/tmp$ cat data11.txt</span><br><span class="line">wangshx zhdan</span><br><span class="line">wsx@wsx-laptop:~/tmp$ sed '/LIST/{</span><br><span class="line">r data11.txt</span><br><span class="line">d</span><br><span class="line">}' notice.std</span><br><span class="line">Would the following people:</span><br><span class="line">wangshx zhdan</span><br><span class="line">please report to the ship's captain.</span><br></pre></td></tr></table></figure>
<p>The placeholder has been replaced with the text from the data file.</p>
<p>The end.</p>
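<p>The same replacement can also be written on a single line. This is a sketch that relies on each <code>-e</code> fragment being treated as its own script line, so the file name given to <code>r</code> ends where the fragment ends. The file names <code>letter.txt</code> and <code>names.txt</code> are made up for illustration:</p>

```shell
workdir=$(mktemp -d)
printf '%s\n' 'Would the following people:' 'LIST' \
  "please report to the ship's captain." > "$workdir/letter.txt"
printf '%s\n' 'wangshx' 'zhdan' > "$workdir/names.txt"

# r queues names.txt for output after the LIST line;
# d then drops the LIST line itself.
letter=$(sed -e "/LIST/{r $workdir/names.txt" -e 'd' -e '}' "$workdir/letter.txt")
printf '%s\n' "$letter"
```

<p>Splitting the script across <code>-e</code> options matters because everything after <code>r</code> up to the end of the script line is taken as the file name.</p>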
]]></content>
<categories>
<category> Linux杂烩 </category>
<category> 文本处理 </category>
</categories>
<tags>
<tag> shell笔记 </tag>
<tag> linux </tag>
</tags>
</entry>
<entry>
<title><![CDATA[Learning Git]]></title>
<url>/2017/12/08/Git-basic-operation/</url>
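<content type="html"><![CDATA[<p>These are plain transcription notes taken from the book 《Github入门与实战》. They cover more or less everything important in it, kept here to look up when needed.</p>
<p>The book mentions a learning site, <a href="https://learngitbranching.js.org/" target="_blank" rel="noopener">https://learngitbranching.js.org/</a>, which is excellent for practicing online.</p>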
<a id="more"></a>
<h1 id="Git基本操作"><a href="#Git基本操作" class="headerlink" title="Git基本操作"></a>Basic Git operations</h1><h2 id="git-init——初始化仓库"><a href="#git-init——初始化仓库" class="headerlink" title="git init——初始化仓库"></a>git init: initialize a repository</h2><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta">$</span><span class="bash"> mkdir git-tutorial</span></span><br><span class="line"><span class="meta">$</span><span class="bash"> <span class="built_in">cd</span> git-tutorial</span></span><br><span class="line"><span class="meta">$</span><span class="bash"> git init</span></span><br><span class="line">Initialized empty Git repository in /Users/hirocaster/github/github-book/git-tutorial/.git/</span><br></pre></td></tr></table></figure>
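<p>If initialization succeeds, a .git directory is created in the directory where git init was run. This .git directory stores the repository data needed to manage the contents of the current directory. In Git, the contents of the directory that holds .git are called the "working tree attached to the repository". Files are edited in the working tree and then recorded in the repository, which is how snapshots of their history are managed. To restore a file to an earlier state, retrieve the earlier snapshot from the repository and check it out into the working tree.</p>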
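<h2 id="git-status——查看仓库状态"><a href="#git-status——查看仓库状态" class="headerlink" title="git status——查看仓库状态"></a>git status: check the repository status</h2><p>The git status command shows the state of a Git repository. It is used constantly, so be sure to remember it.</p>
<p>The state of the working tree and the repository keeps changing as you work with them. Checking the current state with git status throughout a Git session is the most basic of basics. Let's look at the current state:</p>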
<figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta">$</span><span class="bash"> git status</span></span><br><span class="line"><span class="meta">#</span><span class="bash"> On branch master</span></span><br><span class="line"><span class="meta">#</span><span class="bash"></span></span><br><span class="line"><span class="meta">#</span><span class="bash"> Initial commit</span></span><br><span class="line"><span class="meta">#</span><span class="bash"></span></span><br><span class="line">nothing to commit (create/copy files and use "git add" to track)</span><br></pre></td></tr></table></figure>
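<p>The output shows that we are on the master branch. Branches are covered shortly, so there is no need to dig into them now. It also shows that there is nothing to commit. A commit "records the current state of all files in the working tree".</p>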
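<h2 id="git-add——向暂存区中添加文件"><a href="#git-add——向暂存区中添加文件" class="headerlink" title="git add——向暂存区中添加文件"></a>git add: add files to the staging area</h2><p>For a file to be managed by a Git repository, it must first be added to the staging area (also called the stage or the index) with git add. The staging area is a temporary area that holds changes before they are committed.</p>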
<figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta">$</span><span class="bash"> git add README.md</span></span><br><span class="line"><span class="meta">$</span><span class="bash"> git status</span></span><br><span class="line"><span class="meta">#</span><span class="bash"> On branch master</span></span><br><span class="line"><span class="meta">#</span><span class="bash"></span></span><br><span class="line"><span class="meta">#</span><span class="bash"> Initial commit</span></span><br><span class="line"><span class="meta">#</span><span class="bash"></span></span><br><span class="line"><span class="meta">#</span><span class="bash"> Changes to be committed:</span></span><br><span class="line"><span class="meta">#</span><span class="bash"> (use <span class="string">"git rm --cached <file>..."</span> to unstage)</span></span><br><span class="line"><span class="meta">#</span><span class="bash"></span></span><br><span class="line"><span class="meta">#</span><span class="bash"> new file: README.md</span></span><br><span class="line"><span class="meta">#</span><span class="bash"></span></span><br></pre></td></tr></table></figure>
<p>After README.md is added to the staging area, the output of git status changes: README.md is now listed under Changes to be committed.</p>
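<h2 id="git-commit——保存仓库的历史记录"><a href="#git-commit——保存仓库的历史记录" class="headerlink" title="git commit——保存仓库的历史记录"></a>git commit: record history in the repository</h2><p>The git commit command records the files currently in the staging area in the repository's history. These records are what make it possible to restore files in the working tree.</p>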
<figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta">$</span><span class="bash"> git commit -m <span class="string">"First commit"</span></span></span><br><span class="line">[master (root-commit) 9f129ba] First commit</span><br><span class="line">1 file changed, 0 insertions(+), 0 deletions(-)</span><br><span class="line">create mode 100644 README.md</span><br></pre></td></tr></table></figure>
<p>The "First commit" after the -m option is the commit message, a summary of this commit.</p>
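<h2 id="git-log——查看提交日志"><a href="#git-log——查看提交日志" class="headerlink" title="git log——查看提交日志"></a>git log: view the commit log</h2><p>The git log command shows the log of past commits in the repository: who committed or merged, when, and what changed as a result. Merging is explained later.</p>
<p>First, let's check whether the git commit we just ran was recorded.</p>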
<figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta">$</span><span class="bash"> git <span class="built_in">log</span></span></span><br><span class="line">commit 9f129bae19b2c82fb4e98cde5890e52a6c546922</span><br><span class="line">Author: hirocaster <[email protected]></span><br><span class="line">Date: Sun May 5 16:06:49 2013 +0900</span><br><span class="line">First commit</span><br></pre></td></tr></table></figure>
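<p>As shown above, the output lists the commit we just made. The "9f129b……" next to commit is the hash that identifies this commit. Other Git commands use this hash to refer to the commit.</p>
<p>The Author line shows the user name and email address configured in Git. The Date line shows the date and time of the commit. Below that is the commit message.</p>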
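<h3 id="只显示提交信息的第一行"><a href="#只显示提交信息的第一行" class="headerlink" title="只显示提交信息的第一行"></a>Show only the first line of the commit message</h3><p>To show only the first-line summary of each commit message, add --pretty=short to the git log command. This makes it easier to take in many commits at once.</p>
<h3 id="只显示指定目录、文件的日志"><a href="#只显示指定目录、文件的日志" class="headerlink" title="只显示指定目录、文件的日志"></a>Show the log for a specific directory or file</h3><p>Add a directory name after git log to show only the log for that directory. Add a file name to show only the log entries related to that file.</p>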
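<p>A hedged sketch of both filters in a throwaway repository. The paths (<code>README.md</code>, <code>docs/guide.md</code>), the commit messages, and the identity settings are assumptions made for the demo:</p>

```shell
repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q
git symbolic-ref HEAD refs/heads/master   # keep the branch name used in these notes
git config user.name "Example User"       # demo-only identity, local to this repo
git config user.email "user@example.com"

echo "# Git tutorial" > README.md
git add README.md
git commit -q -m "First commit"

mkdir docs
echo "guide" > docs/guide.md
git add docs/guide.md
git commit -q -m "Add guide"

# Short format, limited to a path. "--" separates options from path names.
readme_log=$(git log --pretty=short -- README.md)
docs_log=$(git log --pretty=short -- docs)
printf '%s\n' "$readme_log" "$docs_log"
```

<p>Each filtered log lists only the commit that touched the given path, even though the repository holds two commits.</p>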
<h3 id="显示文件的改动"><a href="#显示文件的改动" class="headerlink" title="显示文件的改动"></a>Show file changes</h3><p>To see the changes a commit introduced, add the -p option. The diff of each file is then shown after the commit message.</p>
<figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta">$</span><span class="bash"> git <span class="built_in">log</span> -p</span></span><br></pre></td></tr></table></figure>
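<p>As described above, git log takes many options that help developers review past commits. There is no need to memorize them all at once. Look them up whenever you want to inspect the log, and they will become second nature.</p>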
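<h2 id="git-diff——查看更改前后的差别"><a href="#git-diff——查看更改前后的差别" class="headerlink" title="git diff——查看更改前后的差别"></a>git diff: view changes</h2><p>The git diff command shows the differences between the working tree, the staging area, and the latest commit. This may be hard to grasp from the description alone, so try it hands-on while following along.</p>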
<h3 id="查看工作树和暂存区的差别"><a href="#查看工作树和暂存区的差别" class="headerlink" title="查看工作树和暂存区的差别"></a>Differences between the working tree and the staging area</h3><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta">$</span><span class="bash"> git diff</span></span><br><span class="line">diff --git a/README.md b/README.md</span><br><span class="line">index e69de29..cb5dc9f 100644</span><br><span class="line">--- a/README.md</span><br><span class="line">+++ b/README.md</span><br><span class="line">@@ -0,0 +1 @@</span><br><span class="line">+# Git教程</span><br></pre></td></tr></table></figure>
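<p>A note on the output: lines marked with "+" were added, and deleted lines are marked with "-". Here only one line was added.</p>
<h3 id="查看工作树和最新提交的差别"><a href="#查看工作树和最新提交的差别" class="headerlink" title="查看工作树和最新提交的差别"></a>Differences between the working tree and the latest commit</h3><p>To see the differences from the latest commit, run the following command.</p>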
<figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta">$</span><span class="bash"> git diff HEAD</span></span><br><span class="line">diff --git a/README.md b/README.md</span><br><span class="line">index e69de29..cb5dc9f 100644</span><br><span class="line">--- a/README.md</span><br><span class="line">+++ b/README.md</span><br><span class="line">@@ -0,0 +1 @@</span><br><span class="line">+# Git教程</span><br></pre></td></tr></table></figure>
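<p><strong>It is a good habit to run git diff HEAD before git commit, review how the working tree differs from the previous commit, and commit only once you are satisfied</strong>. Here HEAD is a pointer to the latest commit on the current branch.</p>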
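<p>The comparisons can be sketched side by side in a scratch repository. Note that <code>git diff --cached</code>, which the notes above do not mention, is the form that shows exactly what the next commit will contain. The file contents and identity settings are assumptions for the demo:</p>

```shell
repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q
git config user.name "Example User"     # demo-only identity
git config user.email "user@example.com"

echo "line 1" > README.md
git add README.md
git commit -q -m "First commit"

echo "line 2" >> README.md      # change A
git add README.md               # stage change A
echo "line 3" >> README.md      # change B, left unstaged

unstaged=$(git diff)            # working tree vs staging area: only "line 3"
staged=$(git diff --cached)     # staging area vs HEAD: only "line 2"
all_changes=$(git diff HEAD)    # working tree vs HEAD: both lines
printf '%s\n' "$all_changes"
```

<p>If part of a change is left unstaged, <code>git diff HEAD</code> shows more than a plain <code>git commit</code> would record, so it is worth checking both views.</p>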
<h1 id="分支操作"><a href="#分支操作" class="headerlink" title="分支操作"></a>Branch operations</h1><p>Used flexibly, branches let several people develop in parallel efficiently. This part covers the Git operations related to branches.</p>
<h2 id="git-branch——显示分支一览表"><a href="#git-branch——显示分支一览表" class="headerlink" title="git branch——显示分支一览表"></a>git branch: list branches</h2><p>The git branch command lists the branch names and shows which branch is currently checked out. Let's run it.</p>
<figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta">$</span><span class="bash"> git branch</span></span><br><span class="line">* master</span><br></pre></td></tr></table></figure>
<p>The "*" (asterisk) to the left of master marks the branch we are currently on, so we are developing on master. No other branch names appear, which means master is the only branch in the local repository.</p>
<h2 id="git-checkout-b——创建、切换分支"><a href="#git-checkout-b——创建、切换分支" class="headerlink" title="git checkout -b——创建、切换分支"></a>git checkout -b: create and switch branches</h2><p>To create a new branch based on the current master branch, use the git checkout -b command.</p>
<h3 id="切换到-feature-A-分支并进行提交"><a href="#切换到-feature-A-分支并进行提交" class="headerlink" title="切换到 feature-A 分支并进行提交"></a>Switch to the feature-A branch and commit</h3><p>Run the following command to create a branch named feature-A and switch to it.</p>
<figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta">$</span><span class="bash"> git checkout -b feature-A</span></span><br><span class="line">Switched to a new branch 'feature-A'</span><br></pre></td></tr></table></figure>
<p>In fact, running the following two commands in succession has the same effect.</p>
<figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta">$</span><span class="bash"> git branch feature-A</span></span><br><span class="line"><span class="meta">$</span><span class="bash"> git checkout feature-A</span></span><br></pre></td></tr></table></figure>
<p>They create the feature-A branch and switch the current branch to it. Listing the branches again shows that we are now on feature-A.</p>
<figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta">$</span><span class="bash"> git branch</span></span><br><span class="line">* feature-A</span><br><span class="line">master</span><br></pre></td></tr></table></figure>
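<p>The "*" to the left of feature-A shows that it is the current branch. If you now edit code as usual, run git add, and commit, the changes are committed to the feature-A branch. Repeatedly committing to a single branch like this (feature-A, for example) is called "growing the branch".</p>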
<h3 id="切换回上一个分支"><a href="#切换回上一个分支" class="headerlink" title="切换回上一个分支"></a>Switch back to the previous branch</h3><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta">$</span><span class="bash"> git checkout -</span></span><br></pre></td></tr></table></figure>
<p>Using "-" (a hyphen) in place of the branch name, as above, switches to the previous branch.</p>
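<p>A quick sketch of the round trip in a scratch repository. The empty commit and the identity settings exist only to make the demo self-contained:</p>

```shell
repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q
git symbolic-ref HEAD refs/heads/master   # keep the branch name used in these notes
git config user.name "Example User"       # demo-only identity
git config user.email "user@example.com"
git commit -q --allow-empty -m "First commit"

git checkout -q -b feature-A     # now on feature-A
git checkout -q -                # back to the previous branch
back_on=$(git symbolic-ref --short HEAD)
git checkout -q -                # and forward to feature-A again
forward_on=$(git symbolic-ref --short HEAD)
printf '%s %s\n' "$back_on" "$forward_on"
```

<p>Repeating the command toggles between the two most recently checked-out branches.</p>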
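<h2 id="特性分支"><a href="#特性分支" class="headerlink" title="特性分支"></a>Feature branches</h2><p>Unlike centralized version control systems such as Subversion (SVN), Git does not need a connection to a central repository to create a branch, so branches are comparatively easy to create. As a result, most workflows today use feature (topic) branches.</p>
<p>As the name suggests, a feature branch is devoted to implementing a single feature (topic) and is used for nothing else. In day-to-day development there are usually several feature branches, plus one stable branch from which the software can be released at any time. The master branch usually plays the role of the stable branch.</p>
<p>Work on a given topic is done in its feature branch and merged into master once the topic is complete. Sticking to this flow keeps the master branch in a state that others can inspect at any time, so other developers can confidently create new feature branches from master.</p>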
<h2 id="git-merge——合并分支"><a href="#git-merge——合并分支" class="headerlink" title="git merge——合并分支"></a>git merge: merge branches</h2><p>Next, suppose feature-A is finished and we want to merge it into the mainline branch, master. First switch to the master branch.</p>
<figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta">$</span><span class="bash"> git checkout master</span></span><br><span class="line">Switched to branch 'master'</span><br></pre></td></tr></table></figure>
<p>Then merge the feature-A branch. To record this merge explicitly in the history, we need a merge commit, so we pass the --no-ff option when merging.</p>
<figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta">$</span><span class="bash"> git merge --no-ff feature-A</span></span><br></pre></td></tr></table></figure>
<p>An editor then opens for entering the merge commit message.</p>
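<p>For scripted use, the message can be supplied up front. The following sketch builds a throwaway repository and merges with <code>-m</code> so that no editor opens. The file contents, commit messages, and identity settings are assumptions for the demo:</p>

```shell
repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q
git symbolic-ref HEAD refs/heads/master   # keep the branch name used in these notes
git config user.name "Example User"       # demo-only identity
git config user.email "user@example.com"

echo "# Git tutorial" > README.md
git add README.md
git commit -q -m "First commit"

git checkout -q -b feature-A
echo "- feature-A" >> README.md
git commit -q -am "Add feature-A"

git checkout -q master
# Without --no-ff this merge would fast-forward and leave no merge commit.
git merge -q --no-ff -m "Merge branch 'feature-A'" feature-A

merge_count=$(git log --merges --pretty=oneline | wc -l | tr -d ' ')
printf 'merge commits: %s\n' "$merge_count"
```

<p>The count of merge commits confirms that <code>--no-ff</code> recorded the merge even though a fast-forward was possible.</p>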
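<h2 id="git-log-–graph——以图表形式查看分支"><a href="#git-log-–graph——以图表形式查看分支" class="headerlink" title="git log –graph——以图表形式查看分支"></a>git log --graph: view branches as a graph</h2><p>Viewing the history with git log --graph makes it clear that the commits from the feature branch (feature-A) have been merged. The creation and merging of the feature branch are also plainly visible.</p>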
<figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta">$</span><span class="bash"> git <span class="built_in">log</span> --graph</span></span><br><span class="line">* commit 83b0b94268675cb715ac6c8a5bc1965938c15f62</span><br><span class="line">|\ Merge: fd0cbf0 8a6c8b9</span><br><span class="line">| | Author: hirocaster <[email protected]></span><br><span class="line">| | Date: Sun May 5 16:37:57 2013 +0900</span><br><span class="line">| |</span><br><span class="line">| | Merge branch 'feature-A'</span><br><span class="line">| |</span><br><span class="line">| * commit 8a6c8b97c8962cd44afb69c65f26d6e1a6c088d8</span><br><span class="line">|/ Author: hirocaster <[email protected]></span><br><span class="line">| Date: Sun May 5 16:22:02 2013 +0900</span><br><span class="line">|</span><br><span class="line">| Add feature-A</span><br><span class="line">|</span><br><span class="line">* commit fd0cbf0d4a25f747230694d95cac1be72d33441d</span><br><span class="line">| Author: hirocaster <[email protected]></span><br><span class="line">| Date: Sun May 5 16:10:15 2013 +0900</span><br><span class="line">|</span><br><span class="line">| Add 
index</span><br><span class="line">|</span><br><span class="line">* commit 9f129bae19b2c82fb4e98cde5890e52a6c546922</span><br><span class="line">Author: hirocaster <[email protected]></span><br><span class="line">Date: Sun May 5 16:06:49 2013 +0900</span><br><span class="line">First commit</span><br></pre></td></tr></table></figure>
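<p>The git log --graph command prints the commit log as a graph, which is very intuitive. Be sure to remember it.</p>
<h1 id="更改提交的操作"><a href="#更改提交的操作" class="headerlink" title="更改提交的操作"></a>Operations that change commits</h1><h2 id="git-reset——回溯历史版本"><a href="#git-reset——回溯历史版本" class="headerlink" title="git reset——回溯历史版本"></a>git reset: go back to an earlier version</h2><p>Another characteristic of Git is that history can be manipulated flexibly. Because repositories are distributed, you can rewrite history locally without affecting other repositories.</p>
<p>To roll the repository's HEAD, the staging area, and the working tree back to a given state, use the git reset --hard command. Given the hash of the target commit, it restores everything to the state at that point. Let's run the following command right away.</p>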
<figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta">$</span><span class="bash"> git reset --hard fd0cbf0d4a25f747230694d95cac1be72d33441d # replace this hash with one from your own repository</span></span><br><span class="line">HEAD is now at fd0cbf0 Add index</span><br></pre></td></tr></table></figure>
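<p><strong>The git log command only shows the history that ends at the current state. To undo a reset, use git reflog to view the repository's operation log, find the hash from before the reset, and restore that state with git reset --hard.</strong></p>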
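<p>A minimal sketch of that recovery in a scratch repository. Empty commits stand in for real work, and the identity settings are assumptions for the demo. Instead of copying a hash from the reflog by hand, it uses the reflog shorthand <code>HEAD@{1}</code>, the position HEAD had before its most recent move:</p>

```shell
repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q
git config user.name "Example User"     # demo-only identity
git config user.email "user@example.com"
git commit -q --allow-empty -m "First commit"
git commit -q --allow-empty -m "Second commit"
latest=$(git rev-parse HEAD)

git reset -q --hard HEAD~1              # go back one commit
visible=$(git log --pretty=oneline | wc -l | tr -d ' ')   # git log no longer shows the second commit

git reflog                              # lists earlier HEAD positions with their hashes
git reset -q --hard 'HEAD@{1}'          # return to where HEAD was before the reset
restored=$(git rev-parse HEAD)
printf '%s\n%s\n' "$latest" "$restored"
```

<p>The two printed hashes match, showing that the commit hidden from <code>git log</code> was still reachable through the reflog.</p>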
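<h2 id="消除冲突"><a href="#消除冲突" class="headerlink" title="消除冲突"></a>Resolving conflicts</h2><h3 id="查看冲突部分并将其解决"><a href="#查看冲突部分并将其解决" class="headerlink" title="查看冲突部分并将其解决"></a>Inspect the conflict and resolve it</h3><p>Open README.md in an editor (if you hit a conflict of your own, open the conflicting file instead) and you will find that its contents now look like this. (This is the example from the book.)</p>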
<figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta">#</span><span class="bash"> Git教程</span></span><br><span class="line"><<<<<<< HEAD</span><br><span class="line">- feature-A</span><br><span class="line">=======</span><br><span class="line">- fix-B</span><br><span class="line"><span class="meta">></span><span class="bash">>>>>>> fix-B</span></span><br></pre></td></tr></table></figure>
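<p>The part above <code>=======</code> is the content of the current HEAD, and the part below it is the content of the fix-B branch being merged. Edit the file in the editor into the desired form.</p>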
<figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta">#</span><span class="bash"> Git教程</span></span><br><span class="line">- feature-A</span><br><span class="line">- fix-B</span><br></pre></td></tr></table></figure>
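<p>As shown above, this resolution keeps the contents of both feature-A and fix-B in the file. In real software development, however, one of the two often has to be removed, so analyze the conflicting parts carefully before editing them.</p>
<h3 id="提交解决后的结果"><a href="#提交解决后的结果" class="headerlink" title="提交解决后的结果"></a>Commit the resolved result</h3><p>Once the conflict is resolved, run git add followed by git commit.</p>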
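<p>The whole cycle can be reproduced in a scratch repository. This is a sketch under assumed file contents and commit messages, with the manual edit replaced by a <code>printf</code> that writes the resolved file:</p>

```shell
repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q
git symbolic-ref HEAD refs/heads/master   # keep the branch name used in these notes
git config user.name "Example User"       # demo-only identity
git config user.email "user@example.com"

echo "# Git tutorial" > README.md
git add README.md
git commit -q -m "First commit"

git checkout -q -b fix-B
echo "- fix-B" >> README.md
git commit -q -am "Fix B"

git checkout -q master
echo "- feature-A" >> README.md
git commit -q -am "Add feature-A"

# Both branches appended a line at the same place, so the merge stops.
git merge --no-ff -m "Merge branch 'fix-B'" fix-B >/dev/null 2>&1 || echo "conflict detected"
conflicted=$(grep -c '^<<<<<<< ' README.md)

# Resolve by keeping both lines, then stage and commit to conclude the merge.
printf '%s\n' '# Git tutorial' '- feature-A' '- fix-B' > README.md
git add README.md
git commit -q -m "Fix conflict"
remaining=$(git status --porcelain)
```

<p>After the final commit the working tree is clean and the history contains one merge commit.</p>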
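<h2 id="git-commit-–amend——修改提交信息"><a href="#git-commit-–amend——修改提交信息" class="headerlink" title="git commit –amend——修改提交信息"></a>git commit --amend: change the commit message</h2><p>To change the message of the most recent commit, run git commit --amend and edit the message in the editor that opens.</p><h2 id="git-rebase-i——压缩历史"><a href="#git-rebase-i——压缩历史" class="headerlink" title="git rebase -i——压缩历史"></a>git rebase -i: squash history</h2><p>Before merging a feature branch, if you notice a small mistake such as a typo in something already committed, you can commit a fix and then fold that fix into the previous commit, squashing them into a single history entry. This technique comes up often, so let's try it out.</p>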
<p>First, create a new feature branch called feature-C.</p>
<p>As the implementation of feature-C, we add a line to README.md and deliberately leave a typo in it to fix later.</p>
<figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta">$</span><span class="bash"> git checkout -b feature-C</span></span><br><span class="line">Switched to a new branch 'feature-C'</span><br></pre></td></tr></table></figure>
<figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta">#</span><span class="bash"> Git教程</span></span><br><span class="line">- feature-A</span><br><span class="line">- fix-B</span><br><span class="line">- faeture-C</span><br></pre></td></tr></table></figure>
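<p>Commit this change. For a change this small there is no need to run git add and then git commit separately. For files that Git already tracks, <strong>git commit -am does both steps at once</strong>.</p>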
<figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta">$</span><span class="bash"> git commit -am <span class="string">"Add feature-C"</span></span></span><br><span class="line">[feature-C 7a34294] Add feature-C</span><br><span class="line">1 file changed, 1 insertion(+)</span><br></pre></td></tr></table></figure>
<p>Now fix the typo we left on purpose (change faeture-C to feature-C), then commit.</p>
<figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta">$</span><span class="bash"> git commit -am <span class="string">"Fix typo"</span></span></span><br><span class="line">[feature-C 6fba227] Fix typo</span><br><span class="line">1 file changed, 1 insertion(+), 1 deletion(-)</span><br></pre></td></tr></table></figure>
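<p>Slips such as misspelled or missing characters are called typos, hence the commit message "Fix typo". In practice we would rather not see commits like this in the history, because a clean history has no need for them. Had the mistake been caught and fixed before the first commit, this commit would never have existed.</p>
<p>So let's rewrite history: fold the change made by "Fix typo" into the previous commit so that the history shows a single, clean commit. For this we use git rebase.</p>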
<figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta">$</span><span class="bash"> git rebase -i HEAD~2</span></span><br></pre></td></tr></table></figure>
<p>Running git rebase this way selects the two most recent commits on the current branch, including HEAD (the latest commit), and opens them in your editor.</p>
<figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br></pre></td><td class="code"><pre><span class="line">pick 7a34294 Add feature-C</span><br><span class="line">pick 6fba227 Fix typo</span><br><span class="line"><span class="meta">#</span><span class="bash"> Rebase 2e7db6f..6fba227 onto 2e7db6f</span></span><br><span class="line"><span class="meta">#</span><span class="bash"></span></span><br><span class="line"><span class="meta">#</span><span class="bash"> Commands:</span></span><br><span class="line"><span class="meta">#</span><span class="bash"> p, pick = use commit</span></span><br><span class="line"><span class="meta">#</span><span class="bash"> r, reword = use commit, but edit the commit message</span></span><br><span class="line"><span class="meta">#</span><span class="bash"> e, edit = use commit, but stop <span class="keyword">for</span> amending</span></span><br><span class="line"><span class="meta">#</span><span class="bash"> s, squash = use commit, but meld into previous commit</span></span><br><span class="line"><span class="meta">#</span><span class="bash"> f, fixup = like <span class="string">"squash"</span>, but discard this commit<span class="string">'s log message</span></span></span><br><span class="line"><span class="meta">#</span><span class="bash"> x, <span class="built_in">exec</span> = run <span class="built_in">command</span> (the rest of the line) using shell</span></span><br><span class="line"><span class="meta">#</span><span class="bash"></span></span><br><span class="line"><span class="meta">#</span><span class="bash"> These lines can be re-ordered; they are executed from top to bottom.</span></span><br><span class="line"><span class="meta">#</span><span class="bash"></span></span><br><span class="line"><span class="meta">#</span><span class="bash"> If you remove a line here THAT COMMIT WILL BE LOST.</span></span><br><span class="line"><span class="meta">#</span><span class="bash"></span></span><br><span class="line"><span class="meta">#</span><span class="bash"> However, <span class="keyword">if</span> you remove everything, the rebase will be aborted.</span></span><br><span class="line"><span class="meta">#</span><span class="bash"></span></span><br><span class="line"><span class="meta">#</span><span class="bash"> Note that empty commits are commented out</span></span><br></pre></td></tr></table></figure>
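<p>We will squash the Fix typo commit (6fba227) into the Add feature-C commit (7a34294). As in the listing below, delete the pick to the left of 6fba227 and replace it with fixup. The first two lines of the listing are the edited instructions; the remaining lines are what Git prints after you save and close the editor.</p>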
<figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br></pre></td><td class="code"><pre><span class="line">pick 7a34294 Add feature-C</span><br><span class="line">fixup 6fba227 Fix typo</span><br><span class="line">[detached HEAD 51440c5] Add feature-C</span><br><span class="line">1 file changed, 1 insertion(+)</span><br><span class="line">Successfully rebased and updated refs/heads/feature-C.</span><br></pre></td></tr></table></figure>
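<p>With that, Fix typo is erased from the history, as if the typo had never appeared in Add feature-C. This is a <strong>benign rewrite of history</strong>. The combined commit does get a new hash (51440c5 above), so this works best on commits that have not been pushed yet.</p>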
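<p>The same squash can also be done without editing the instruction list by hand. The sketch below is one possible way, not the procedure used above: it relies on Git's <code>--fixup</code> and <code>--autosquash</code> options, and the scratch repository, file contents, and identity are made up for illustration.</p>

```shell
#!/bin/bash
# Sketch: fold a typo fix into the previous commit with --fixup/--autosquash.
# The temporary repository and the identity below are illustrative only.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.name "Example User"
git config user.email "user@example.com"

# Base commit, so that HEAD~2 exists later on.
printf -- '- feature-A\n- fix-B\n' > README.md
git add README.md
git commit -q -m "Initial commit"

# The commit that contains the deliberate typo.
printf -- '- faeture-C\n' >> README.md
git commit -q -am "Add feature-C"

# Fix the typo and record the fix as a fixup of the previous commit (HEAD).
printf -- '- feature-A\n- fix-B\n- feature-C\n' > README.md
git commit -q -a --fixup=HEAD

# GIT_SEQUENCE_EDITOR=true accepts the generated instruction list unchanged,
# so the "fixup!" commit is melded into "Add feature-C" automatically.
GIT_SEQUENCE_EDITOR=true git rebase -q -i --autosquash HEAD~2

git log --oneline
```

<p>After the rebase the log should list only the initial commit and Add feature-C, with the corrected line in README.md.</p>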
<h1 id="推送至远程仓库"><a href="#推送至远程仓库" class="headerlink" title="推送至远程仓库"></a>Pushing to a remote repository</h1><h2 id="git-remote-add——添加远程仓库"><a href="#git-remote-add——添加远程仓库" class="headerlink" title="git remote add——添加远程仓库"></a>git remote add: adding a remote repository</h2><p>The repository created on GitHub has the path "git@github.com:your-username/git-tutorial.git". We now use git remote add to register it as the remote of our local repository.</p>
<figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta">$</span><span class="bash"> git remote add origin git@github.com:github-book/git-tutorial.git</span></span><br></pre></td></tr></table></figure>
<p>After running git remote add in this form, Git uses origin as the name (identifier) of the remote repository git@github.com:github-book/git-tutorial.git.</p>
<h2 id="git-push——推送至远程仓库"><a href="#git-push——推送至远程仓库" class="headerlink" title="git push——推送至远程仓库"></a>git push: pushing to a remote repository</h2><h3 id="推送至-master-分支"><a href="#推送至-master-分支" class="headerlink" title="推送至 master 分支"></a>Pushing to the master branch</h3><p>To push the contents of the current branch of the local repository to the remote repository, use git push. Assume we are working on the master branch.</p>
<figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta">$</span><span class="bash"> git push -u origin master</span></span><br><span class="line">Counting objects: 20, done.</span><br><span class="line">Delta compression using up to 8 threads.</span><br><span class="line">Compressing objects: 100% (10/10), done.</span><br><span class="line">Writing objects: 100% (20/20), 1.60 KiB, done.</span><br><span class="line">Total 20 (delta 3), reused 0 (delta 0)</span><br><span class="line">To git@github.com:github-book/git-tutorial.git</span><br><span class="line">* [new branch] master -> master</span><br><span class="line">Branch master set up to track remote branch master from origin.</span><br></pre></td></tr></table></figure>
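<p>Running git push like this pushes the current branch to the master branch of the remote repository origin. The -u option additionally sets origin's master branch as the upstream of the current local branch. With the upstream set, a later git pull on this branch fetches from origin's master without any extra arguments. After this operation the contents of the local master branch are on GitHub, and you can confirm there that the remote master branch matches the local one.</p>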
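<p>To see what -u actually records, here is a small sketch that uses a local bare repository in place of GitHub. The temporary paths and the identity are assumptions made for the example; only the Git commands themselves come from the walkthrough above.</p>

```shell
#!/bin/bash
# Sketch: inspect the upstream that "git push -u" records.
# A local bare repository stands in for the GitHub remote (illustrative only).
set -e
work=$(mktemp -d)
git init -q --bare "$work/remote.git"
git init -q "$work/local"
cd "$work/local"
git config user.name "Example User"
git config user.email "user@example.com"

echo "# Git tutorial" > README.md
git add README.md
git commit -q -m "First commit"
git branch -M master            # make sure the branch is called master

git remote add origin "$work/remote.git"
git push -q -u origin master

# The upstream is stored in the branch configuration of the local repository.
upstream=$(git rev-parse --abbrev-ref 'master@{upstream}')
echo "master tracks: $upstream"
echo "remote: $(git config branch.master.remote)"
echo "merge:  $(git config branch.master.merge)"
```

<p>The two <code>branch.master.*</code> settings are what let a plain git pull know where to fetch from.</p>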
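<h3 id="推送至-master-以外的分支"><a href="#推送至-master-以外的分支" class="headerlink" title="推送至 master 以外的分支"></a>Pushing to a branch other than master</h3><p>A remote repository can hold branches other than master. As an example, we create a feature-D branch in the local repository and push it to the remote repository under the same name.</p>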
<figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta">$</span><span class="bash"> git checkout -b feature-D</span></span><br><span class="line">Switched to a new branch 'feature-D'</span><br></pre></td></tr></table></figure>
<p>With feature-D created locally, we now push it to the remote repository, keeping the branch name unchanged.</p>
<figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta">$</span><span class="bash"> git push -u origin feature-D</span></span><br><span class="line">Total 0 (delta 0), reused 0 (delta 0)</span><br><span class="line">To git@github.com:github-book/git-tutorial.git</span><br><span class="line">* [new branch] feature-D -> feature-D</span><br><span class="line">Branch feature-D set up to track remote branch feature-D from origin.</span><br></pre></td></tr></table></figure>
<h1 id="从远程仓库获取"><a href="#从远程仓库获取" class="headerlink" title="从远程仓库获取"></a>Fetching from a remote repository</h1><h2 id="git-clone——获取远程仓库"><a href="#git-clone——获取远程仓库" class="headerlink" title="git clone——获取远程仓库"></a>git clone: getting a remote repository</h2><h3 id="获取远程仓库"><a href="#获取远程仓库" class="headerlink" title="获取远程仓库"></a>Getting the remote repository</h3><p>First, switch to a different directory and clone the GitHub repository there. Make sure it is not the directory that holds the repository we have been working in.</p>
<figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta">$</span><span class="bash"> git <span class="built_in">clone</span> git@github.com:github-book/git-tutorial.git</span></span><br><span class="line">Cloning into 'git-tutorial'...</span><br><span class="line">remote: Counting objects: 20, done.</span><br><span class="line">remote: Compressing objects: 100% (7/7), done.</span><br><span class="line">remote: Total 20 (delta 3), reused 20 (delta 3)</span><br><span class="line">Receiving objects: 100% (20/20), done.</span><br><span class="line">Resolving deltas: 100% (3/3), done.</span><br><span class="line"><span class="meta">$</span><span class="bash"> <span class="built_in">cd</span> git-tutorial</span></span><br></pre></td></tr></table></figure>
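<p>After git clone we are on the master branch by default, and Git automatically sets origin as the identifier of the remote repository. In other words, the master branch of this local repository now has exactly the same contents as the master branch of the remote repository (origin) on GitHub.</p>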
<figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta">$</span><span class="bash"> git branch -a</span></span><br><span class="line">* master</span><br><span class="line">remotes/origin/HEAD -> origin/master</span><br><span class="line">remotes/origin/feature-D</span><br><span class="line">remotes/origin/master</span><br></pre></td></tr></table></figure>
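<p>We use git branch -a to list branch information. The -a option shows the branches of both the local repository and the remote repository.<br>The output includes remotes/origin/feature-D, which confirms that the remote repository already has a feature-D branch.</p>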
<h3 id="获取远程的-feature-D-分支"><a href="#获取远程的-feature-D-分支" class="headerlink" title="获取远程的 feature-D 分支"></a>Getting the remote feature-D branch</h3><p>Let's bring the feature-D branch into the local repository.</p>
<figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta">$</span><span class="bash"> git checkout -b feature-D origin/feature-D</span></span><br><span class="line">Branch feature-D set up to track remote branch feature-D from origin.</span><br><span class="line">Switched to a new branch 'feature-D'</span><br></pre></td></tr></table></figure>
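<p>The name after -b is the name of the new branch in the local repository. To keep things easy to follow we call it feature-D as well, matching the branch on the remote. After the new branch name comes the branch to start from. The example specifies origin/feature-D, which means: create a local feature-D branch based on the feature-D branch of the repository named origin (here, the repository on GitHub).</p>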
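<p>When the local branch is meant to have the same name as the remote one, Git's <code>--track</code> option offers a shorter spelling. The sketch below is only an illustration of that alternative: it builds a throwaway remote with a feature-D branch (paths and identity are invented), clones it, and checks that the new local branch tracks origin/feature-D.</p>

```shell
#!/bin/bash
# Sketch: "git checkout --track origin/feature-D" as a shorter form of
# "git checkout -b feature-D origin/feature-D". All paths are throwaway.
set -e
work=$(mktemp -d)
git init -q --bare "$work/remote.git"

# Seed the stand-in remote with master and feature-D.
git init -q "$work/seed"
cd "$work/seed"
git config user.name "Example User"
git config user.email "user@example.com"
echo "# Git tutorial" > README.md
git add README.md
git commit -q -m "First commit"
git branch -M master
git branch feature-D
git remote add origin "$work/remote.git"
git push -q origin master feature-D
# Point the bare repository's HEAD at master so that clones check it out.
git --git-dir="$work/remote.git" symbolic-ref HEAD refs/heads/master

# Clone into a different directory and fetch the remote branch by name.
git clone -q "$work/remote.git" "$work/clone"
cd "$work/clone"
git checkout -q --track origin/feature-D

git rev-parse --abbrev-ref HEAD
git rev-parse --abbrev-ref '@{upstream}'
```

<p>Use the explicit -b form from the walkthrough when the local branch should carry a different name than the remote one.</p>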
<h2 id="git-pull——获取最新的远程仓库分支"><a href="#git-pull——获取最新的远程仓库分支" class="headerlink" title="git pull——获取最新的远程仓库分支"></a>git pull: getting the latest remote branch</h2><p>The remote feature-D branch now contains the commit we just pushed. We can use git pull to bring the local feature-D branch up to date. The current branch is feature-D.</p>
<figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta">$</span><span class="bash"> git pull origin feature-D</span></span><br><span class="line">remote: Counting objects: 5, done.</span><br><span class="line">remote: Compressing objects: 100% (1/1), done.</span><br><span class="line">remote: Total 3 (delta 1), reused 3 (delta 1)</span><br><span class="line">Unpacking objects: 100% (3/3), done.</span><br><span class="line">From github.com:github-book/git-tutorial</span><br><span class="line">* branch feature-D -> FETCH_HEAD</span><br><span class="line">First, rewinding head to replay your work on top of it...</span><br><span class="line">Fast-forwarded feature-D to ed9721e686f8c588e55ec6b8071b669f411486b8.</span><br></pre></td></tr></table></figure>
<hr>
<h1 id="如何用Github的gh-pages分支展示自己的项目"><a href="#如何用Github的gh-pages分支展示自己的项目" class="headerlink" title="如何用Github的gh-pages分支展示自己的项目"></a>Showing off your project with GitHub's gh-pages branch</h1><figure class="highlight maxima"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">git subtree <span class="built_in">push</span> --<span class="built_in">prefix</span>=dist <span class="built_in">origin</span> gh-pages</span><br></pre></td></tr></table></figure>
<p>This pushes the contents of the dist directory (named by --prefix) to the gh-pages branch on origin.</p>
]]></content>
<categories>
<category> Linux杂烩 </category>
<category> Git </category>
</categories>
<tags>
<tag> linux </tag>
<tag> git </tag>
<tag> github </tag>
</tags>
</entry>
<entry>
<title><![CDATA[Creating Text Menus and Window Widgets with Shell]]></title>
<url>/2017/11/29/shell-create-text-menu-and-window/</url>
<content type="html"><![CDATA[<p><em>Source: Linux Command Line and Shell Scripting Bible</em></p>
<p>Contents:</p>
<blockquote>
<ul>
<li>Creating text menus</li>
<li>Creating text window widgets</li>
<li>Adding X Window graphics</li>
</ul>
</blockquote>
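<h2 id="创建文本菜单"><a href="#创建文本菜单" class="headerlink" title="创建文本菜单"></a>Creating text menus</h2><p>The most common way to build an interactive shell script is a menu: listing the available options helps users see what the script can and cannot do.</p>
<p>The heart of a shell script menu is the <code>case</code> command, which runs specific commands depending on the user's menu selection.</p>
<p>Below we go through the steps of building a menu-based shell script.</p>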
<a id="more"></a>
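<h3 id="创建菜单布局"><a href="#创建菜单布局" class="headerlink" title="创建菜单布局"></a>Creating the menu layout</h3><p><strong>The first step</strong> is to decide which elements appear in the menu and how they are laid out.</p>
<p><strong>Before drawing a menu, it is usual to clear whatever is already on the screen, so that the menu appears in a clean environment with no distractions.</strong></p>
<p>The <code>clear</code> command uses the <code>terminfo</code> data of the current terminal to remove the text on the screen. After running <code>clear</code>, you can display the menu elements with <code>echo</code>.</p>
<p><strong>By default, the echo command only displays printable text characters.</strong> Non-printable characters such as tabs and newlines are very useful when building a menu, so we add the <code>-e</code> option, which makes <code>echo</code> interpret the escape sequences (such as <code>\t</code> and <code>\n</code>) in its argument.</p>
<p>For example:</p>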
<figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br></pre></td><td class="code"><pre><span class="line">wsx@wsx:~/tmp$ echo -e "1.\tDisplay disk space"</span><br><span class="line">1. Display disk space</span><br></pre></td></tr></table></figure>
<p>This makes it easy to format the layout of the menu items; a handful of <code>echo</code> commands is enough to produce a decent-looking menu.</p>
<figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br></pre></td><td class="code"><pre><span class="line">clear</span><br><span class="line">echo</span><br><span class="line">echo -e "\t\t\tSys Admin Menu\n"</span><br><span class="line">echo -e "\t1. Display disk space"</span><br><span class="line">echo -e "\t2. Display logged on users"</span><br><span class="line">echo -e "\t3. Display memory usage"</span><br><span class="line">echo -e "\t0. Exit menu\n\n"</span><br><span class="line">echo -en "\t\tEnter an option: "</span><br></pre></td></tr></table></figure>
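<p>The <code>-en</code> option on the last line suppresses the trailing newline, which makes the menu look more polished: the cursor waits at the end of the line for the user's input.</p>
<p><strong>The last step in creating the menu is getting the user's input.</strong> This is done with the <code>read</code> command. Since we only expect a single character, we add the <code>-n</code> option with a value of 1. The user then only has to type one digit, without pressing Enter.</p>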
<figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">read -n 1 option</span><br></pre></td></tr></table></figure>
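<p>As a variation on the layout above, the menu can also be drawn with <code>printf</code>, which interprets <code>\t</code> and <code>\n</code> in its format string without needing <code>-e</code>. This is only a sketch: the function names <code>show_menu</code> and <code>read_option</code> are made up for the example and are not part of the original script.</p>

```shell
#!/bin/bash
# Sketch: draw the menu with printf and read a single keystroke.
# show_menu and read_option are illustrative names.

show_menu() {
    printf '\n\t\t\tSys Admin Menu\n\n'
    # printf reuses the format string for each pair of arguments.
    printf '\t%s. %s\n' \
        1 "Display disk space" \
        2 "Display logged on users" \
        3 "Display memory usage" \
        0 "Exit menu"
    printf '\n\n\t\tEnter an option: '
}

read_option() {
    # -n 1 returns after one character; -r keeps backslashes literal.
    IFS= read -r -n 1 option
}
```

<p>In a script you would call <code>clear</code>, then <code>show_menu</code>, then <code>read_option</code>, and find the typed character in <code>$option</code>.</p>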
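<h3 id="创建菜单函数"><a href="#创建菜单函数" class="headerlink" title="创建菜单函数"></a>Creating the menu functions</h3><p>Shell script menu options are easier to implement as a set of separate functions, one shell function per menu item. <strong>The first step</strong> is to decide which tasks the script should perform and then put each of them into the code as a function.</p>
<p><strong>For a function that is not implemented yet, it is common to create a <em>stub function</em> first: either an empty function or one containing a single echo statement that says what will eventually go there.</strong></p>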
<figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br></pre></td><td class="code"><pre><span class="line">function diskspace {</span><br><span class="line">    clear</span><br><span class="line">    echo "This is where the diskspace commands will go"</span><br><span class="line">}</span><br></pre></td></tr></table></figure>
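<p>Stubs let the menu keep working while individual functions are still being written, so we do not have to finish every function before the menu can be used. Each function starts with <code>clear</code> so that it runs on a clean screen, undisturbed by the menu that was shown before.</p>
<p><strong>It also helps to put the menu layout itself into a function.</strong></p>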
<figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br></pre></td><td class="code"><pre><span class="line">function menu {</span><br><span class="line">    clear</span><br><span class="line">    echo</span><br><span class="line">    echo -e "\t\t\tSys Admin Menu\n"</span><br><span class="line">    echo -e "\t1. Display disk space"</span><br><span class="line">    echo -e "\t2. Display logged on users"</span><br><span class="line">    echo -e "\t3. Display memory usage"</span><br><span class="line">    echo -e "\t0. Exit menu\n\n"</span><br><span class="line">    echo -en "\t\tEnter an option: "</span><br><span class="line">    read -n 1 option</span><br><span class="line">}</span><br></pre></td></tr></table></figure>
<p>This way the menu can be redrawn at any time simply by calling the function.</p>
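<h3 id="添加菜单逻辑"><a href="#添加菜单逻辑" class="headerlink" title="添加菜单逻辑"></a>Adding the menu logic</h3><p>Next we need the program logic that ties the menu layout and the functions together. This is where the <code>case</code> command comes in.</p>
<p>The <code>case</code> command should call the function that matches the character entered in the menu, and use the <code>case</code> wildcard pattern, an asterisk (<code>*</code>), to catch every invalid entry.</p>
<p>The following shows a typical use of <code>case</code> for a menu:</p>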
<figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br></pre></td><td class="code"><pre><span class="line">menu</span><br><span class="line">case $option in</span><br><span class="line">0)</span><br><span class="line">    break ;;</span><br><span class="line">1)</span><br><span class="line">    diskspace ;;</span><br><span class="line">2)</span><br><span class="line">    whoseon ;;</span><br><span class="line">3)</span><br><span class="line">    memusage ;;</span><br><span class="line">*)</span><br><span class="line">    clear</span><br><span class="line">    echo "Sorry, wrong selection";;</span><br><span class="line">esac</span><br></pre></td></tr></table></figure>
<p>The code first calls the <code>menu</code> function, which clears the screen and displays the menu. The <code>read</code> command inside <code>menu</code> waits until the user types a character. The <code>case</code> command then takes over and calls the function that matches that character. The <code>break</code> under option 0 only has an effect once this fragment sits inside a loop, as it does in the complete script.</p>
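<p>One way to make this dispatch logic easy to try out on its own is to wrap the <code>case</code> statement in a function and use stubs for the handlers. The following is a sketch under that assumption; <code>handle_option</code> is a hypothetical helper name, and the stub bodies are placeholders rather than the real commands.</p>

```shell
#!/bin/bash
# Sketch: the case dispatch wrapped in a function, with stub handlers.
# handle_option is a hypothetical helper; the stubs only print a label.

diskspace() { echo "disk space stub"; }
whoseon()   { echo "logged-on users stub"; }
memusage()  { echo "memory usage stub"; }

# Runs the handler for one menu choice.
# Returns 1 when the user chose to leave the menu, 0 otherwise.
handle_option() {
    case "$1" in
        0) return 1 ;;
        1) diskspace ;;
        2) whoseon ;;
        3) memusage ;;
        *) echo "Sorry, wrong selection" ;;
    esac
    return 0
}
```

<p>A surrounding loop can then keep redrawing the menu for as long as <code>handle_option "$option"</code> returns 0, which avoids relying on <code>break</code> inside the <code>case</code> statement.</p>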
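<h3 id="整合shell脚本菜单"><a href="#整合shell脚本菜单" class="headerlink" title="整合shell脚本菜单"></a>Putting the shell script menu together</h3><p>Now let's combine all of the previous steps and see how they work together.</p>
<p>Here is a complete example of a menu script:</p>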
<figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br><span class="line">39</span><br><span class="line">40</span><br><span class="line">41</span><br><span class="line">42</span><br><span class="line">43</span><br><span class="line">44</span><br><span class="line">45</span><br><span class="line">46</span><br><span class="line">47</span><br><span class="line">48</span><br><span class="line">49</span><br><span class="line">50</span><br><span class="line">51</span><br></pre></td><td class="code"><pre><span class="line">wsx@wsx:~/tmp$ cat test14</span><br><span class="line"><span class="meta">#</span><span class="bash">!/bin/bash</span></span><br><span class="line"><span class="meta">#</span><span class="bash"> simple script menu</span></span><br><span class="line"></span><br><span class="line">function diskspace {</span><br><span class="line">    clear</span><br><span class="line">    df -k</span><br><span class="line">}</span><br><span class="line"></span><br><span class="line">function whoseon {</span><br><span class="line">    clear</span><br><span class="line">    who</span><br><span class="line">}</span><br><span class="line"></span><br><span class="line">function memusage {</span><br><span class="line">    clear</span><br><span class="line">    cat /proc/meminfo</span><br><span class="line">}</span><br><span class="line"></span><br><span class="line">function menu {</span><br><span class="line">    clear</span><br><span class="line">    echo</span><br><span class="line">    echo -e "\t\t\tSys Admin Menu\n"</span><br><span class="line">    echo -e "\t1. Display disk space"</span><br><span class="line">    echo -e "\t2. Display logged on users"</span><br><span class="line">    echo -e "\t3. Display memory usage"</span><br><span class="line">    echo -e "\t0. Exit menu\n\n"</span><br><span class="line">    echo -en "\t\tEnter an option: "</span><br><span class="line">    read -n 1 option</span><br><span class="line">}</span><br><span class="line"></span><br><span class="line">while [ 1 ]</span><br><span class="line">do</span><br><span class="line">    menu</span><br><span class="line">    case $option in</span><br><span class="line">    0)</span><br><span class="line">        break ;;</span><br><span class="line">    1)</span><br><span class="line">        diskspace ;;</span><br><span class="line">    2)</span><br><span class="line">        whoseon ;;</span><br><span class="line">    3)</span><br><span class="line">        memusage ;;</span><br><span class="line">    *)</span><br><span class="line">        clear</span><br><span class="line">        echo "Sorry, wrong selection" ;;</span><br><span class="line">    esac</span><br><span class="line">    echo -en "\n\n\t\t\tHit any key to continue"</span><br><span class="line">    read -n 1 line</span><br><span class="line">done</span><br><span class="line">clear</span><br></pre></td></tr></table></figure>
<p>Running it:</p>
<figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br></pre></td><td class="code"><pre><span class="line">        Sys Admin Menu</span><br><span class="line"></span><br><span class="line">1. Display disk space</span><br><span class="line">2. Display logged on users</span><br><span class="line">3. Display memory usage</span><br><span class="line">0. Exit menu</span><br><span class="line"></span><br><span class="line"></span><br><span class="line">    Enter an option:</span><br></pre></td></tr></table></figure>
<p>Entering 1:</p>
<figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br></pre></td><td class="code"><pre><span class="line">文件系统 1K-块 已用 可用 已用% 挂载点</span><br><span class="line">udev 4006080 0 4006080 0% /dev</span><br><span class="line">tmpfs 807220 81004 726216 11% /run</span><br><span class="line">/dev/sda4 305650672 14226064 275828680 5% /</span><br><span class="line">tmpfs 4036100 1724 4034376 1% /dev/shm</span><br><span class="line">tmpfs 5120 4 5116 1% /run/lock</span><br><span class="line">tmpfs 4036100 0 4036100 0% /sys/fs/cgroup</span><br><span class="line">/dev/sda3 524272 4684 519588 1% /boot/efi</span><br><span class="line">tmpfs 807220 52 807168 1% /run/user/1000</span><br><span class="line">tmpfs 807220 16 807204 1% /run/user/125</span><br><span class="line">/dev/sda2 421886972 23340376 398546596 6% /media/wsx/存储</span><br><span class="line"></span><br><span class="line"></span><br><span class="line">            Hit any key to continue</span><br></pre></td></tr></table></figure>
<p>The other options work the same way and are left for you to try.</p>
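<h3 id="使用select命令"><a href="#使用select命令" class="headerlink" title="使用select命令"></a>Using the select command</h3><p>The <code>select</code> command builds a menu from a single command line, then reads the user's answer and processes it automatically.</p>
<p>The command has the following format:</p>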
<figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br></pre></td><td class="code"><pre><span class="line">select variable in list</span><br><span class="line">do</span><br><span class="line">    commands</span><br><span class="line">done</span><br></pre></td></tr></table></figure>
<p><strong>The <code>list</code> parameter is a space-separated list of text items that together make up the menu.</strong> The <code>select</code> command displays each item as a numbered option and then shows a special prompt defined by the <code>PS3</code> environment variable.</p>
<figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br></pre></td><td class="code"><pre><span class="line">wsx@wsx:~/tmp$ cat smenu1</span><br><span class="line"><span class="meta">#</span><span class="bash">!/bin/bash</span></span><br><span class="line"><span class="meta">#</span><span class="bash"> using select <span class="keyword">in</span> the menu</span></span><br><span class="line"></span><br><span class="line">function diskspace {</span><br><span class="line">    clear</span><br><span class="line">    df -k</span><br><span class="line">}</span><br><span class="line"></span><br><span class="line">function whoseon {</span><br><span class="line">    clear</span><br><span class="line">    who</span><br><span class="line">}</span><br><span class="line"></span><br><span class="line">function memusage {</span><br><span class="line">    clear</span><br><span class="line">    cat /proc/meminfo</span><br><span class="line">}</span><br><span class="line"></span><br><span class="line">PS3="Enter an option: "</span><br><span class="line">select option in "Display disk space" "Display logged on users" "Display memory usage" "Exit program"</span><br><span class="line">do</span><br><span class="line">    case $option in</span><br><span class="line">    "Exit program")</span><br><span class="line">        break ;;</span><br><span class="line">    "Display disk space")</span><br><span class="line">        diskspace ;;</span><br><span class="line">    "Display logged on users")</span><br><span class="line">        whoseon ;;</span><br><span class="line">    "Display memory usage")</span><br><span class="line">        memusage ;;</span><br><span class="line">    *)</span><br><span class="line">        clear</span><br><span class="line">        echo "Sorry, wrong selection";;</span><br><span class="line">    esac</span><br><span class="line">done</span><br><span class="line">clear</span><br></pre></td></tr></table></figure>
<p>Running the script produces the following menu automatically:</p>
<figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br></pre></td><td class="code"><pre><span class="line">wsx@wsx:~/tmp$ ./smenu1</span><br><span class="line">1) Display disk space 3) Display memory usage</span><br><span class="line">2) Display logged on users 4) Exit program</span><br><span class="line">Enter an option:</span><br></pre></td></tr></table></figure>
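<p><strong>When using <code>select</code>, remember that the value stored in the variable is the entire text string of the item, not the number shown next to it. The text string is what you compare against in the <code>case</code> statement.</strong></p>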
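<p>If you do need the number the user typed, bash keeps the raw input in the <code>REPLY</code> variable while the loop variable holds the item text. The sketch below illustrates the difference; <code>choose_task</code> is an invented function name, and the item list simply reuses the menu entries from the script above.</p>

```shell
#!/bin/bash
# Sketch: inside select, the loop variable holds the item text and
# REPLY holds what the user actually typed. choose_task is a made-up name.

choose_task() {
    local choice
    local PS3="Enter an option: "
    select choice in "Display disk space" "Display logged on users" "Display memory usage"
    do
        if [ -n "$choice" ]; then
            echo "text=$choice number=$REPLY"
            break
        fi
        # An invalid entry leaves the loop variable empty.
        echo "Invalid selection: $REPLY"
    done
}
```

<p>Note that <code>select</code> writes the menu and the prompt to standard error, so only the <code>echo</code> lines end up on standard output.</p>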
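<h2 id="制作窗口"><a href="#制作窗口" class="headerlink" title="制作窗口"></a>Building windows</h2><p>The <code>dialog</code> package uses ANSI escape control codes to create standard window dialogs in a text environment. These dialogs can be built into your own shell scripts to interact with the user. This part covers how to use the <code>dialog</code> package.</p>
<p>Installation (on Debian or Ubuntu):</p>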
<figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">sudo apt-get install dialog</span><br></pre></td></tr></table></figure>
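<h3 id="dialog包"><a href="#dialog包" class="headerlink" title="dialog包"></a>The dialog package</h3><p>The <code>dialog</code> command uses command-line parameters to decide which window widget to produce. Widget is the <code>dialog</code> package's term for a type of window element.</p>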
<table>
<thead>
<tr>
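<th>Widget</th>
<th>Description</th>
</tr>
</thead>
<tbody>
<tr>
<td>calendar</td>
<td>Provides a calendar for picking a date</td>
</tr>
<tr>
<td>checklist</td>
<td>Displays multiple entries, each of which can be turned on or off</td>
</tr>
<tr>
<td>form</td>
<td>Builds a form with labels and text fields that can be filled in</td>
</tr>
<tr>
<td>fselect</td>
<td>Provides a file selection window for browsing to a file</td>
</tr>
<tr>
<td>gauge</td>
<td>Displays a progress bar showing the percentage completed</td>
</tr>
<tr>
<td>infobox</td>
<td>Displays a message without waiting for a response</td>
</tr>
<tr>
<td>inputbox</td>
<td>Provides a single text form for entering text</td>
</tr>
<tr>
<td>inputmenu</td>
<td>Provides an editable menu</td>
</tr>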
<tr>
<td>menu</td>